ABSTRACT


System  security  and  information  assurance  requirements  and  specifications 
incorporated  into  the  architectural  design  of  a  network  enterprise  must  be  driven  by  an 
adaptable  and  evolving  network  enterprise  risk  management  plan.  Network  Risk 
Management  must  start  at  concept  design  and  relate  to  the  network’s  Concept  of 
Operations.  The  purpose  of  this  thesis  is  to  examine  some  of  the  essential  elements 
necessary  in  a  network  enterprise  risk  management  plan  for  a  complex  global  networked 
system  similar  to  the  Global  Information  Grid  (GIG).  It  compares  the  current  Department 
of  Defense  (DoD)  framework  for  risk  management  with  other  popular  network  risk 
management  process  models.  An  important  but  difficult  part  of  the  risk  management 
process is determining the value of network assets. Another important but overlooked
element of risk management processes is evaluating the network for resiliency: the
ability to return to normal in time to prevent the compromise of a mission. The contention
is  that  risk  management  planning  must  include  planning  for  network  survivability  and 
resiliency.  Selected  elementary  network  architectures  are  analyzed  for  attributes  of  the 
architectures  that  promote  information  assurance  qualities  of  confidentiality,  integrity, 
and  availability.  Finally,  recommendations  are  made  on  applying  important  elements  of 
network  risk  management  into  the  conceptual  architecture  of  a  global  network. 
EXECUTIVE SUMMARY


The Department of Defense and the military services have clearly articulated their
vision  of  information  operations  and  the  use  of  networking  as  the  future  strategy  for 
military  operations  based  on  knowledge  superiority.  Computers  and  the  automated 
processing  capability  inherent  in  them  used  to  be  seen  as  one  of  the  technology  tools  in  a 
tool  set  to  increase  the  effectiveness  of  major  weapons  platforms  and  maneuver  warfare, 
including  increasing  capability  in  operations,  training,  logistics,  and  communications 
within  the  different  warfare  areas.  Today,  that  vision  has  expanded  to  raise  the 
precedence  of  information  technology  and  networking  power  and  put  it  in  a  warfare  area 
of  its  own.  Information  and  its  collection,  manipulation,  distribution,  use,  and  protection 
are  considered  vital  to  the  future  of  warfare  and  are  central  to  the  defense  strategy  of  the 
nation. As information superiority and Net-centric Operations grow in importance, the
systems that make this vision a reality are continuously threatened by malicious entities
that use ingenious ways to gain access to those information systems; disrupt and deny
valid users access; and steal, fabricate, or distort the programs and data resident in these
systems. To make the Global Information Grid (GIG) a reality, DoD must follow rigorous
discipline in systems engineering principles and a robust risk management process,
translating desired capabilities into detailed requirements and specifications that drive the
architecture of the global network. This is no small task. Tying together legacy and
developing systems into a global network connected by diverse state-of-the-art
communications links is a complex and expensive endeavor. The network is populated by
tremendous  amounts  of  data,  which  interface  and  interoperate  with  many  systems  of 
varying  functionality.  The  network  is  intended  to  provide  the  right  information  at  all 
levels  of  command.  Security  requirements  drive  the  architecture  of  the  GIG  network  as 
well. 

This  thesis  investigates  possible  solutions  to  concerns  of  the  U.S.  Navy  leadership 
with  regard  to  network  enterprise  operations  and  security  protection  procedures,  focusing 
on  the  following  network  enterprise  risk  management  issues. 


•  The considerations in developing and improving the planning and
implementation of the network enterprise risk management processes
presently used by DoD and the military services for their information
systems and networks in the global domain of Net-Centric Operations,
Net-Centric Warfare, and Information Dominance.

•  The  attributes  of  a  robust  network  enterprise  risk  management  program 
and  how  it  should  be  implemented  so  that  it  supports  the  confidentiality, 
availability,  integrity,  reliability,  and  trustworthiness  of  the  military’s 
critical  information  resources  so  they  act  as  an  enabler  to  mission  success. 

•  Some  of  the  architectural  implications  in  the  design  of  hardware 
(topology)  and  software  (network  management  and  control)  and  the 
process  imperatives  in  the  operation  of  networked  information  systems 
that  make  a  global  network  system  survivable  and  resilient  to  attack  from 
hostile  forces. 

•  Some  of  the  popular  risk  management  processes  in  the  public  domain  and 
how  their  methodology  might  enhance  DoD’s  network  enterprise  risk 
management  process  to  achieve  a  survivable  and  resilient  enterprise 
network and make decisions on the cost-benefit or value of the choices in
implementing  network  security  measures,  ensuring  quality  of  service  and 
information  assurance. 

NETWORK  ARCHITECTURE 

Networks  are  architected  in  different  arrangements  to  provide  different 
capabilities  while  efficiently  using  available  resources.  The  way  a  network  is  architected 
can  introduce  vulnerabilities  through  the  make-up  of  the  network  components,  by  the 
way  the  components  are  connected,  and  by  the  methods  employed  and  the  layer  of  the 
network  where  they  are  employed  to  provide  protection  for  network  and  information 
assets.  No  matter  how  elegant  the  architecture  of  a  network,  designed  to  provide  a  set  of 
services,  a  network  compromised  by  the  enemy  could  affect  operations  from  mission 
degradation  to  mission  failure;  from  the  top  level  of  the  chain  of  command  to  the  tactical 
units  in  the  field.  As  network  vulnerabilities  are  discovered  and  an  assessment  of  the  risks 
associated with those vulnerabilities is conducted, it is important to determine which
network quality of service attributes U.S. protection services and countermeasures
(technological,  procedural,  or  managerial)  seek  to  secure  or  enhance  through  the 
protection  of  data  and  program  resources,  and  how  these  attributes  can  be  secured 
through innovative system design and network architecting. It is important in the
development and operation of a network (as well as any system) to learn how
vulnerabilities are created and how they are discovered. This information provides system
developers with valuable experience to draw upon when developing follow-on systems.
The topological arrangement of a network is the hardware architecture, and
different topologies introduce different vulnerabilities. Mesh network architecture offers
the network superior resilience from attack, but these systems are costly and complex to
set  up.  The  software  network  management  of  a  mesh  network  can  be  quite  convoluted 
and  it  is  hard  to  monitor  the  effectiveness  of  the  management  and  security  of  the  network. 
However,  if  the  threat  of  attack  against  the  value  of  the  assets  in  this  type  of  network  is 
high,  the  cost  and  the  complexity  of  design  and  installation  may  be  worth  it.  The  GIG  is  a 
combination  of  many  architectural  topology  arrangements,  riding  on  the  backbone  of  the 
Defense  Information  Systems  Agency  (DISA)  intranet  bus.  For  this  reason,  a 
comprehensive  enterprise  risk  management  program  has  to  consider  the  GIG  architecture 
from  the  top  down  in  aggregation  and  from  the  bottom  up  as  each  type  of  topology  used 
in  interconnecting  networks  affects  the  enterprise  vulnerability  picture.  The  architectural 
framework of the GIG and of the Navy’s FORCENet follows the layered approach of the
International Organization for Standardization (ISO), but with three main layers. It is the
Navy’s  intent  that  information  assurance,  quality  of  service,  and  Human  Systems 
Integration  activities  penetrate  all  three  layers  of  the  reference  model.  A  plan  to  evaluate 
and  mitigate  risks  to  network  enterprises  must  consider  the  network’s  layer  abstractions. 
For  the  network’s  security  software  logic  and  mechanisms  to  operate  correctly,  it  is 
important  to  know  in  which  network  layer  abstraction  a  risk  mitigating  strategy  will  be 
defined  and  implemented. 
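
The trade-off between resilience and cost noted for mesh architectures can be illustrated with a minimal sketch. It is not drawn from the GIG design: the five-node star and full mesh topologies, the function names, and the single-node-failure criterion are all assumptions chosen for illustration. The sketch counts the links each topology needs and checks whether the surviving nodes remain connected after any one node is lost.

```python
from collections import deque
from itertools import combinations


def build_star(n):
    """Star topology (illustrative): node 0 is the hub, nodes 1..n-1 are leaves."""
    return {(0, i) for i in range(1, n)}


def build_full_mesh(n):
    """Full mesh topology (illustrative): every pair of nodes shares a direct link."""
    return set(combinations(range(n), 2))


def is_connected(nodes, links):
    """Breadth-first search over the links whose endpoints both survive."""
    nodes = set(nodes)
    if not nodes:
        return True
    neighbors = {v: set() for v in nodes}
    for a, b in links:
        if a in nodes and b in nodes:
            neighbors[a].add(b)
            neighbors[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in neighbors[v] - seen:
            seen.add(w)
            queue.append(w)
    return seen == nodes


def survives_any_single_node_loss(n, links):
    """True if the remaining nodes stay connected whichever single node fails."""
    everyone = set(range(n))
    return all(is_connected(everyone - {lost}, links) for lost in range(n))


if __name__ == "__main__":
    for label, links in (("star", build_star(5)), ("full mesh", build_full_mesh(5))):
        print(label, len(links), "links; survives single node loss:",
              survives_any_single_node_loss(5, links))
```

Under these assumptions the star needs 4 links but is partitioned when its hub is lost, while the full mesh needs 10 links and tolerates the loss of any one node. A real assessment would also have to weigh link failures, management complexity, and the value of the assets at each node.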

NETWORK  ENTERPRISE  RISK  MANAGEMENT 

That  adversaries  are  preparing  to  deny  U.S.  and  allied  forces  free  access  to 
information  supporting  superiority  in  military  operations  is  sound  judgment  supported  by 
much  evidence.  Along  with  possible  organized  (on  a  national  or  sovereign  scale)  efforts 
to  deny  the  U.S.  military  access  to  their  information,  many  independent  actors  exist  with 
an  agenda  and  motivations  not  necessarily  aligned  with  any  cause  who  relish  the 
challenge of breaking into networks containing high value information, not the least of
which are the U.S. military’s network systems. For this reason, DoD has mandated that a
major part of the development and operation of information system networks is the
requirement for a robust information security program, primarily centered on Information
Assurance (IA). IA is chartered to develop, test, and implement measures to protect
networks and information systems’ assets while at the same time meeting the
sometimes-conflicting objective of maintaining maximum network accessibility to the war fighter
who needs it, often under hostile conditions or in harsh environments.

As  the  expertise  of  the  threat  in  being  able  to  “hack”  into  networks  continues  to 
grow  at  an  ever  increasing  pace,  the  costs  of  countering  that  threat  can  skyrocket  as  well. 
The  complexity  of  the  software  programs  and  architecture  designed  to  mitigate  the  risks 
of  today  and  to  anticipate  the  risks  of  tomorrow  brings  with  it  higher  costs  in  technology 
acquisition,  costs  of  training  operators  in  its  installation  and  operation,  the  engineering 
costs  of  design  and  testing  to  ensure  the  right  countermeasures  have  been  acquired  and 
applied,  and  the  costs  of  vigilance  in  monitoring  the  network  for  intrusion.  As  Bruce 
Schneider  said  in  CIO  magazine  concerning  network  security:  “I’m  here  to  tell  you  it’s 
not about the technology” (Schneider, 2001). Since it is increasingly difficult for
countermeasure  technology  to  keep  up,  he  advocates  a  program  of  continuous  monitoring 
of  the  network’s  operation.  With  the  responsibilities  of  normal  watch  standing  duties,  the 
added  monitoring  of  network  operations  places  added  burden  on  manpower  costs,  both 
real  and  opportunity. 

To be able to mitigate risk to the operations of a local network or an enterprise
system  of  systems  and  achieve  operational  or  strategic  goals,  identified  risks  need  to  be 
assessed  to  make  the  decision  whether  it  is  worth  the  cost  in  funding  or  opportunity  to 
plan  and  implement  a  mitigation  strategy  for  that  risk.  The  answer  to  the  question  “What 
is  this  mitigation  strategy  protecting?”  has  a  direct  effect  on  the  mitigation  strategy 
employed.  For  risks  of  little  or  no  impact  no  matter  the  likelihood  of  occurrence,  the 
mitigation  strategy  may  be  one  of  accepting  the  risk  as  is.  Implementation  of  security 
requirements  that  restrict  functionality  of  a  network  incurs  monetary  as  well  as 
opportunity  costs,  life  cycle  costs,  and  some  hidden  or  latent  costs  (such  as  stakeholder 
costs in the future). If the system and the information being protected have little impact
on the success of achieving the desired effect, it might be prudent to reallocate that
funding and technical solution elsewhere.
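
The cost-benefit reasoning in this paragraph can be sketched as a simple decision rule. The sketch is illustrative only: the `Risk` and `Mitigation` classes, the `decide` function, the negligible-impact threshold, and the use of likelihood multiplied by impact as a measure of exposure are assumptions introduced here, not a DoD-prescribed method.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: float  # assumed probability of occurrence, 0.0 to 1.0
    impact: float      # assumed loss to the mission if it occurs, in cost units

    @property
    def exposure(self) -> float:
        """Expected loss: likelihood multiplied by impact."""
        return self.likelihood * self.impact


@dataclass
class Mitigation:
    cost: float                 # funding plus opportunity cost, same units as impact
    residual_likelihood: float  # likelihood remaining once the mitigation is in place


def decide(risk: Risk, mitigation: Mitigation, negligible_impact: float) -> str:
    """Return "accept" or "mitigate" for one risk and one candidate mitigation."""
    # Risks of little or no impact are accepted regardless of likelihood.
    if risk.impact <= negligible_impact:
        return "accept"
    # Otherwise mitigate only if the exposure removed exceeds what it costs.
    benefit = risk.exposure - mitigation.residual_likelihood * risk.impact
    return "mitigate" if benefit > mitigation.cost else "accept"


if __name__ == "__main__":
    intrusion = Risk("intrusion", likelihood=0.5, impact=1000.0)
    print(decide(intrusion, Mitigation(cost=100.0, residual_likelihood=0.25), 50.0))
    print(decide(intrusion, Mitigation(cost=400.0, residual_likelihood=0.25), 50.0))
```

In the usage example, the same risk is worth mitigating at a cost of 100 units but not at 400, because the mitigation removes only 250 units of exposure. In practice, likelihood and impact are rarely known this precisely, so such a rule would support, rather than replace, informed judgment.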

Network  enterprise  risk  analysis  is  an  important  part  of  the  implementation  of  a 
network’s security posture as it overlaps with the IA program. Network enterprise risk
management should be implemented at the beginning of a network system’s lifecycle.
The risk analysis and management process follows the steps of general risk management
processes for safety, program, operational, and enterprise risks in organizations. Unique
to network risk management is the concentration on an unpredictable threat who is
motivated to exploit network vulnerabilities that the threat discovers for an ultimate goal
of  gaining  something  of  value. 

Drawing  at  once  on  the  concepts  of  game  theory  and  fault  trees  used  in  reliability 
analysis, attack trees can be useful in identifying and analyzing network vulnerabilities
and the paths that can be exploited to gain access to the assets of the network. At the
same time, they are useful in gaming the attributes of a threat that would make the threat
more or less likely to make an attack on the network.
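
A minimal sketch of an attack tree may make the idea concrete. The tree below is hypothetical: the goal, the leaf names, and the attacker costs are invented for illustration, and the evaluation rule (an OR node takes its cheapest child, an AND node sums its children) is one common convention rather than the only one.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    name: str
    kind: str = "leaf"   # "leaf", "and", or "or"
    cost: float = 0.0    # attacker's cost; meaningful for leaves only
    children: List["Node"] = field(default_factory=list)


def cheapest_attack(node: Node) -> Tuple[float, List[str]]:
    """Return (cost, leaf steps) of the cheapest way to achieve this node."""
    if node.kind == "leaf":
        return node.cost, [node.name]
    results = [cheapest_attack(child) for child in node.children]
    if node.kind == "and":
        # Every child must succeed, so costs add and steps accumulate.
        total = sum(cost for cost, _ in results)
        steps = [step for _, path in results for step in path]
        return total, steps
    if node.kind == "or":
        # Any one child suffices, so the attacker picks the cheapest.
        return min(results, key=lambda result: result[0])
    raise ValueError(f"unknown node kind: {node.kind}")


def example_tree(credential_cost: float = 30) -> Node:
    """Hypothetical tree; all names and costs are illustrative assumptions."""
    return Node("read protected data", "or", children=[
        Node("steal credentials", cost=credential_cost),
        Node("exploit server", "and", children=[
            Node("find unpatched service", cost=10),
            Node("bypass firewall", cost=40),
        ]),
    ])


if __name__ == "__main__":
    print(cheapest_attack(example_tree()))
    # Model a countermeasure that makes credential theft more expensive.
    print(cheapest_attack(example_tree(credential_cost=80)))
```

Raising the cost of one leaf, as a countermeasure might, shifts the cheapest path to a different branch of the tree. This is the sense in which the tree can be used to game a threat's likely choices: hardening a single path is of limited value if a comparably cheap alternative remains.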

The risk management plan must take into account not only the threats, system
vulnerabilities, impacts, and mitigation implementation plans; it must also go one step
beyond to determine how to architect the system for survivability and system
recoverability. The network must be designed from well-considered requirements to
resist, recognize, and recover from an attack. The risk management plan must be
mission-oriented. It must also be balanced, considering the costs of various risk mitigation and
survivability design choices in terms of acquisition resources and in the effects of those
choices on end-user functionality. The decisions made as a result of the risk mitigation
and survivability planning and implementation drive the network architecture and design
and, properly executed, result in an interoperable and networked system of systems and
family of systems providing the war fighter with the right information at the right time,
and easily operated programs and applications to put ordnance on target and to keep the
enemies’ ordnance off us.


NETWORK  SURVIVABILITY  AND  RESILIENCY 


While  risk  analysis  and  management  are  designed  to  find  and  fix  vulnerabilities 
that put the network at risk by the threat that exploits them with the intent to gain access
to  valuable  information  system  assets,  survivability  is  the  attribute  of  a  system  that 
defines  how  it  deals  with  an  actual  exploitation  of  network  vulnerabilities  that  have 
remained after mitigation implementation. It is the architecting of a system, before an
attack, to respond to an attack that succeeds despite the mitigation plans already
implemented to reduce network vulnerability. In other words,
survivability and resiliency provide defense in depth for a network by designing in the
capability to continue to resist and to recover after an attack. While it is vitally
important  to  manage  the  risk  to  a  network  before  attack  and  to  make  every  effort  to  keep 
it  from  happening,  a  further  defense  mechanism  and  process  needs  to  be  in  place  in  the 
event  of  an  attack. 

Survivability  is  scenario-driven,  and  defining  survivability  requirements  with 
which  to  build  a  survivable  network  system  is  challenging.  For  this  reason,  the  architect 
must  look  at  the  network’s  boundaries,  the  interface  to  other  networks,  and  define  where 
the  line  is  drawn  to  resist  attacks  from  threats  that  come  in  various  ways  with  an  array  of 
capabilities.  Once  an  attacker  has  penetrated  a  network,  the  architect  must  look  at  the 
capability  of  the  system  to  adapt  and  recover  while  stopping  the  attacker’s  progression. 
Adaptability has to be built into a system at initial design. Unlike manned systems that
can adapt with human intervention, network systems must adapt automatically, in
fractions of a second, through complex software logic.
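
One way to turn a survivability scenario into a checkable requirement is to compare the time a service needs to recognize and recover from an attack with the outage the mission can tolerate. The sketch below is a simplified assumption, not a method taken from the thesis: the `ServiceRecovery` class, its fields, and the example timings are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServiceRecovery:
    name: str
    recognize_seconds: float         # assumed time to recognize that an attack is under way
    recover_seconds: float           # assumed time to restore service once recognized
    tolerable_outage_seconds: float  # assumed outage the mission can absorb

    @property
    def outage_seconds(self) -> float:
        """Total disruption: recognition time plus recovery time."""
        return self.recognize_seconds + self.recover_seconds

    @property
    def is_resilient(self) -> bool:
        """True if service returns to normal before the mission is compromised."""
        return self.outage_seconds <= self.tolerable_outage_seconds


def resiliency_shortfalls(services: List[ServiceRecovery]) -> List[str]:
    """Names of services that would not recover in time under this scenario."""
    return [s.name for s in services if not s.is_resilient]


if __name__ == "__main__":
    scenario = [
        ServiceRecovery("track data", 0.25, 0.5, 1.0),
        ServiceRecovery("messaging", 2.0, 30.0, 10.0),
    ]
    print(resiliency_shortfalls(scenario))
```

Each scenario the architect considers would supply its own timings, and the services that appear in the shortfall list indicate where automatic adaptation needs to be designed in.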

CONCLUSIONS 

This thesis (1) examines a network’s architecture from the hardware aspect
(topology) and software aspect (layers and Service Oriented Architecture (SOA)) and how
certain architectures create or mitigate vulnerabilities that could be exploited by threats,
and (2) develops a risk management process to enhance DoD Net-centric operations and
the  GIG  architectural  framework.  The  result  is  a  comprehensive  network  enterprise  risk 
management  plan  with  the  flexibility  to  adapt  to  a  changing  environment. 

The  second  contention  of  the  thesis  is  that  the  network  enterprise  needs  to  be 
architected  with  survivability  and  resiliency.  A  solid  network  risk  management  plan  can 
inform  the  architects  and  engineers  where  a  network’s  vulnerabilities  exist  so  that 
survivability  and  resilience  can  be  built  into  a  network  system  designed  to  provide  critical 
services  in  the  face  of  an  attack  on  the  network. 

There  are  risks  associated  with  every  endeavor.  In  the  quest  to  develop  an 
interoperable,  interactive,  and  collaborative  network  enterprise  across  DoD  that  achieves 
DoD’s  strategic  goal  of  information  dominance  against  the  adversary,  a  key  ingredient  in 
the  success  of  that  goal  is  to  identify  the  risks  to  the  network  and  develop  a  plan  through 
knowledgeable  assessment  to  mitigate  the  risks  to  an  acceptable  level.  Doing  this 
contributes  greatly  in  allowing  the  network  to  provide  the  war  fighters  the  information 
and  capability  they  require  to  gain  the  edge  in  situational  awareness,  no  matter  the  size  or 
characterization  of  the  mission.  To  shy  away  from  the  possibility  of  network  attack  by 
initiating  uninformed  security  measures  which  unduly  inhibit  the  network  enterprise 
functionality,  or  to  ignore  the  risk  in  an  effort  to  meet  budget  targets,  assures  a  less  than 
satisfactory  capability  and  acts  as  an  impediment  to  reaching  the  ultimate  goal  of 
information  dominance.  Risk  to  the  network  enterprise  must  be  dealt  with  up  front  by 
first  designing  the  network  for  resiliency,  second  by  constant  vigilance  to  the  changing 
environment  at  the  boundaries/interfaces  of  the  network,  and  third  by  building  in  an 
adaptability  that  learns  from  the  attempted  attacks  as  well  as  the  successful  ones, 
strengthening  the  network  in  every  iteration.  Proper  employment  of  a  rigorous  network 
enterprise  risk  management  plan  supported  by  leadership  delivers  a  network  enterprise 
system  that  can  deliver  the  goods  of  accurate,  uncompromised,  and  available  information 
when  it  is  most  needed. 
I.  INTRODUCTION


John G. Grimes, the Department of Defense Chief Information Officer, states:

The security challenges of the 21st century are characterized by change
and  uncertainty.  Operations  vary  widely  and  partners  cannot  be 
anticipated.  However,  we  are  confronting  that  uncertainty  by  becoming 
more  agile.  Greater  levels  of  agility  rest  upon  leveraging  the  power  of 
information-the centerpiece of today’s Defense transformation to net-centric
operations (NCO). (DoD CIO, 2007, p. 1)

A.  BACKGROUND 

1.  Information  Technology  Evolution 

The  information  age  brings  new  capability  to  the  military  by  introducing  software 
into the design of systems to run computers that provide control, automation, and data
manipulation,  making  these  systems  more  capable.  It  is  estimated  that  the  contribution  of 
software  into  the  engineering  effort  of  system  design  over  the  last  decade  has  increased 
30 to 70 percent (Maier & Rechtin, 2002). Communications technologies benefit from the
use  of  software  by  allowing  the  transmission  of  information  in  new  forms;  increasing  the 
speed  of  delivery  and  the  quantity  of  the  data  transmitted.  Systems  are  tied  together  into 
networks  so  that  data  and  programs  can  be  shared  between  similar  systems  and  across 
geographical  boundaries  to  enhance  the  capability  of  the  system  and  the  war  fighters  who 
use  those  systems.  However,  this  technology  is  applied  in  eclectic  fashion  to  new  systems 
development,  so  that  when  the  systems  are  brought  into  production,  they  are  equipped 
with  the  software  available  at  the  time  they  are  developed.  Out  of  this,  systems  and 
systems  of  systems  are  developed  with  differing  characteristics  (network  architecture  of 
hardware,  software,  and  firmware,  operating  systems,  applications,  connectivity 
protocols,  and  use  of  the  electro-magnetic  spectrum)  so  that  presently  there  are  thousands 
of  different  programs  running  on  different  networks  satisfying  the  requirements  of  a 
particular stove-piped war fighting capability. In addition, the technology of
computational  power  and  networking  is  applied  to  the  military’s  business  enterprise  and 
to  the  collection  and  dissemination  of  intelligence  data. 
2.  Information Technology Today

As  information  systems  in  the  military  grow  in  importance  and  capability,  the 
services  and  DoD  attempt  to  control  the  characterization  of  the  systems’  software  and 
how  systems  operate  through  limited  configuration  control  using  standards  like  the 
Defense  Information  Infrastructure  Common  Operating  Environment  (DII/COE).  This 
standard  is  designed  to  achieve  a  commonality  between  software  components  designed 
into  systems  to  allow  some  form  of  configuration  control  and  to  allow  interoperability 
between systems (Stewart, 2006). However, the present information systems architecture
in the Navy, and in DoD in general, is composed of multiple networks serving
stove-piped applications, which are further partitioned by functional category (business
enterprise; combat systems; Command, Control, Communications, Computers, &
Intelligence (C4I); logistics/supply; and specialized intelligence gathering, to name a few).
The  military  recognizes  now  that  information  and  easy  access  to  that  information  is  a  key 
element  in  gaining  an  advantage  over  the  enemy,  and  that  the  next  frontier  in  warfare  is 
the  ability  to  leverage  information  system  interoperability  and  to  quickly  turn  data  and 
information  into  knowledge  superiority  that  results  in  the  advantage  over  an  adversary. 
Superior  firepower  is  one  thing,  but  the  knowledge  of  when  and  where  to  apply  it  is 
another.  To  accomplish  this  mission  and  to  achieve  the  capability  of  knowledge 
superiority,  the  military  needs  architecture  for  an  integrated  and  interoperable 
information  system  of  systems. 

3.  Future  Vision  of  Information  Technology 

As  the  quote  at  the  beginning  of  this  section  states,  the  leaders  of  DoD  and  the 
military  services  have  clearly  articulated  their  vision  of  information  operations  and  the 
future  strategy  for  military  operations  based  on  knowledge  superiority.  Computers  and 
the  automated  processing  capability  inherent  in  them  used  to  be  seen  as  one  of  the 
technology  tools  in  a  tool  set  to  increase  the  effectiveness  of  major  weapons  platforms 
and  to  increase  capability  in  other  aspects  of  operating,  training,  equipping,  and 
communicating in the different warfare areas (NSA/CSS/GIG, 2008). Today, that vision
has expanded to make information technology, processing, and networking power a
warfare area of its own: cyber-warfare. Information and its collection, manipulation,
distribution,  use,  and  protection  are  considered  vital  to  the  future  of  warfare  and  are 
central  to  the  defense  strategy  of  the  nation.  Information  systems  are  no  longer 
considered  to  be  stand-alone.  They  must  be  networked  and  interoperable;  capable  of 
sharing  information  with  all  authorized  personnel  and  entities.  Warfare  and  any  military 
operation  are  now  mandated  to  be  net-centric. 

To  comply  with  the  Secretary  of  Defense,  the  Chairman  of  the  Joint  Chiefs  and 
the  service  Chiefs’  vision,  the  DoD’s  Chief  Information  Officer  formulated  an 
architecture  for  the  GIG  and  a  pathway  to  take  the  military  from  the  present  day  GIG  to 
what he calls the Target GIG: a system of information capabilities gained through
procedure and technology, including doctrine, organization, training, materiel, leadership
and education, personnel, and facilities (DOTMLPF), that provides an agile, dynamic,
interoperable,  and  responsive  system.  Today’s  grid  is  sporadically  networked  along 
stove-piped  structures  of  services,  warfare  areas,  special  capabilities,  and  partitioned 
organizations.  While  the  networks  are  not  an  exact  reflection  of  the  chain  of  command 
under  which  they  are  governed,  they  do  have  somewhat  of  a  hierarchical  architecture  to 
them.  Each  networked  system  uses  its  own  technology  base  and  is  run  in  accordance  with 
local  procedures  for  the  most  part.  There  are  over-arching  rules  and  procedures 
emanating  from  organizations  such  as  the  Defense  Information  Systems  Agency  (DISA), 
National Security Agency (NSA), and individual service controllers such as Naval
Network  Warfare  Command  (NAVNETWARCOM). 

4.  Architectural Vision for Networked Information Systems

The  target  GIG  architecture  is  designed  to  allow  the  users  “to  find  and  share  the 
information  they  need,  when  they  need  it,  in  a  form  they  can  understand,  use,  and  act  on 
with  confidence;  and  protects  information  from  those  who  should  not  have  it”  (DoD  CIO, 
2007, p. 7). The GIG technology is based on a SOA of loosely coupled repositories of
services  accessible  to  any  node  on  the  network  that  has  access  rights  (what  the  CIO  calls 
need  to  share).  The  technology  draws  from  commercial  technologies  already  developed 
to  architect  a  system  using  Open  Architecture,  allowing  a  cost-effective  way  to  design 
and operate the systems throughout their life cycle and capitalizing on the reuse of
software  and  firmware  components.  This  architecture  is  the  key  enabler  of  Net-Centric 
Operations  (NCO)  and  Net-Centric  Warfare  (NCW).  Figure  1  shows  a  model  of  the 
architecture. 
The GIG Federated Architectural Framework is the structure that
ties the disparate architectures of the services together as they
exist today.

Figure 1.  GIG Federated Architecture (From: DoD CIO, 2007)

The ultimate goal of the GIG architecture is to move from a federated to an
enterprise architecture. To be federated means that individual programs are networked
together through a tightly coupled framework. Enterprise, through the application of
SOA, means the systems are loosely coupled services networked on an architecture
that is agile and employs collaboration as its main ingredient of communication (DoD
CIO, 2007).
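
The notion of loosely coupled services reachable by any node with access rights can be illustrated with a small sketch. It is a toy model under stated assumptions, not the GIG's actual service infrastructure: the `ServiceRegistry` class, its methods, the role names, and the example services are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Set


class ServiceRegistry:
    """Toy registry: consumers find services by name, never by direct wiring."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[..., object]] = {}
        self._authorized: Dict[str, Set[str]] = {}

    def publish(self, name: str, handler: Callable[..., object],
                authorized_roles: Iterable[str]) -> None:
        """Register a service and the roles that have a need to share it."""
        self._handlers[name] = handler
        self._authorized[name] = set(authorized_roles)

    def discover(self, role: str) -> List[str]:
        """List, in sorted order, the services this role is allowed to use."""
        return sorted(name for name, roles in self._authorized.items()
                      if role in roles)

    def invoke(self, name: str, role: str, *args: object) -> object:
        """Call a service by name; unknown or unauthorized requests are refused."""
        if role not in self._authorized.get(name, set()):
            raise PermissionError(f"{role} may not use {name}")
        return self._handlers[name](*args)


if __name__ == "__main__":
    registry = ServiceRegistry()
    registry.publish("track_data", lambda track_id: {"track": track_id}, {"operator"})
    registry.publish("logistics", lambda: "ok", {"operator", "supply"})
    print(registry.discover("supply"))
    print(registry.invoke("track_data", "operator", 7))
```

Because a consumer depends only on a service name and its own role, a provider can be replaced or relocated without changing the consumer, which is the sense of loose coupling used here. A tightly coupled, federated arrangement would instead wire each program directly to the specific programs it talks to.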

To  achieve  the  vision  of  the  DoD’s  target  GIG,  each  service  has  an  information 
technology  initiative  that  falls  in  line  with  the  architectural  framework  of  the  GIG, 
employing the principles of a distributed system built on a SOA. Standards are based on
those of commercial standards bodies such as the Institute of Electrical and Electronics
Engineers (IEEE), Internet Engineering Task Force (IETF), International Organization for
Standardization (ISO), and American National Standards Institute (ANSI). Additional standards
stem from service-specific initiatives including FORCENet for the Navy and Marine
Corps, the Air Force Challenger program, LandWarNet as part of the Future Combat
Systems for the Army, and Deep Water for the Coast Guard.

5.  Challenges 

a.  Architecture 

Many  challenges  face  the  achievement  of  a  net-centric  system  of  systems 
throughout  DoD  built  on  the  target  GIG  architectural  framework.  The  sheer  size  of  the 
GIG technologically, financially, and procedurally is daunting (NSA/CSS/GIG, 2008).
Because of this size and the number of disparate networks linked together, security planning,
implementation, and coordination of effort across the services are difficult. Another
obvious  challenge  is  the  coordination  between  the  services  as  the  development  of  a 
unified  network  architected  in  the  GIG  framework  evolves.  The  service  components  may 
not  be  ready  and  willing  to  connect  their  individual  network  systems  in  synchronization 
with  the  target  GIG  objectives.  Each  service  has  individual  goals  they  are  trying  to 
achieve,  but  for  the  GIG  to  be  truly  interoperable,  there  must  be  a  consolidated  set  of 
objectives  and  a  common  approach  to  development  controlled  and  monitored  by  the  DoD 
agencies responsible for overseeing the development of the GIG. As can be seen in
Figure 2, the set of services that resides in the GIG architecture is meant to be
all-inclusive from all service components, tied together by SOA. The interoperability of the
core services meets the CIO’s goal of information sharing. Choosing a SOA is considered
to be a way of mitigating the risks associated with the interconnectivity of the complete
set of services: the loose coupling of services through SOA allows connectivity
and collaboration while keeping individual network interfaces less exposed to the
vulnerabilities of those networks to which they are connected. Some disadvantages to this
form of architecting core services connectivity are discussed in Chapter II.


[Figure 2 diagram: service groups joined through a central SOA infrastructure. Recoverable
labels include Core Enterprise Services, Information Assurance Services, Warfighting
Applications, Business Applications, Defense Intelligence Applications, and Other Services.]

SOA is a framework for achieving the CIO goal of information sharing
between core services with an interconnectivity that promotes security.

Figure 2. GIG Core Services and Underlying SOA Infrastructure (From: DoD CIO,
2007)

b.  GAO  Concerns 

In  a  2004  report,  the  U.S.  Government  Accountability  Office  (GAO) 
examined  the  process  of  GIG  development  and  uncovered  areas  of  concern  (USGAO, 
2004). GAO found the following.

• Identification and prioritization of technology investments had not yet been
articulated. With a system this large and complex, the financial decisions
and the acquisition strategy have huge ramifications for the end product’s
ability to perform as desired. In addition, it is not clear how GIG
technology investments affect other programs’ resources for development.

• It was not clear how standards would be enforced during development,
or by whom.

•  Planning  was  lacking  on  how  to  deal  with  advancements  in  technology 
and  how  they  would  be  incorporated  (or  not).  Especially  important  here  is 
what  technology  path  to  pursue  as  parallel  technologies  in  hardware, 
software,  protocols,  applications  and  methods  are  developed.  Which  one 
would  DoD  choose  to  apply  to  the  GIG?  Since  the  GIG  is  based  on  open 
architecture,  commercial  products  are  an  integral  part  of  the  system,  and 
choosing  the  technology  that  prevails  is  important  to  the  lifecycle  costs 
and  the  development  of  other  functionalities  that  depend  on  that  product 
choice.  Recall  the  DoD’s  choice  of  the  Sony  Betamax  as  the  video  cassette 
recorder  for  shipboard  entertainment  systems. 

• The size of the system makes it difficult to evaluate the degree of
enhancement to war fighting capability the GIG offers. In addition, the
development of this capability takes time, and the architectural framework
does not articulate how new capabilities will be evaluated against new
threats as the national strategy and the environment change.

• Network bandwidth has been a challenge for individual networks, and even
with consolidation of some commercial services’ operating bands, it will
continue to be a challenge as the size of the GIG and its requirements for
connectivity grow. Through the GIG-BE (bandwidth expansion) program, the
vision is of a GIG agile enough to allocate bandwidth to the entity that
needs it at the time it is needed. What remains unknown, however, is how
much bandwidth the entire system requires and how it is to be obtained
given the competing interests of national and international commercial
enterprises and other Government organizations (such as the
Department of Homeland Security).

•  Protection  of  data  within  the  current  systems  as  well  as  the  data  generated 
during  the  development  of  the  GIG  has  not  been  given  the  attention 
required. For instance, in the Core Enterprise Services layer of the GIG,
“...Parts  of  the  computing  infrastructure  are  operated  and  maintained  by 
commercial  or  government  computing  service  providers  (CSP)  that 
provide  managed  services  for  hosting  and  maintaining  enterprise  services 
and  applications...”  (DoD  CIO,  2007,  p.  21).  The  GAO  asked  how  the 
GIG  developers  assure  current  system  owners  (legacy  and  component 
owners) that their data are secure and remain so given the objective of the
GIG to broaden the sharing of information even with coalition partners
who are not yet identified. Protection extends not only to the IA technology,
but also to the procedures for safeguarding data, including the following.

•  Who  owns  the  data? 

•  Who  has  authority  to  release  data? 

• What is the plan if data is inadvertently or maliciously released to
organizations or countries to which DoD does not want it
released?

•  What  is  the  impact  on  mission  accomplishment? 

•  How  is  the  impact  determined? 

• How does DoD recover from that data being in the wrong hands?

c.  Space  and  Naval  Warfare  Command  Concerns 

Other concerns and challenges in the development of a Net-Centric
Warfare capability with a robust, agile and interoperable network have been expressed by
service organizations, such as the Space and Naval Warfare (SPAWAR) Command and
its systems centers, which are responsible for the acquisition and development of Command,
Control, Communications, Computers, Intelligence, Surveillance, and Reconnaissance
(C4ISR) systems (Davis, 2008; Anderson, Davis, & Green, 2008). Some
of their concerns are the following.

• The data protection policy and a security architecture to protect data: a
scheme for prioritizing protection levels for data, and how that scheme
affects the multi-level security and cross-domain solutions for sharing
data. One plan for the protection of data is what SPAWAR calls a
Data-centric Security Approach. Prioritizing, partitioning, temporal value
determination, data ownership, levels of and need for access, and storage and
back-up requirements are just a few of the elements of this plan.

•  Establishing  protections  and  procedures  for  the  supply  chain 
management  of  computer  network  components.  This  is  especially 
important  in  the  future  development  of  Net-centric  Operations,  which  are 
built  on  the  concept  of  Open  Architecture  and  the  procurement  of 
commercially  developed  and  manufactured  items.  There  needs  to  be  a  plan 
for  how  to  ensure  the  level  of  quality  of  the  procurement. 

• Configuration management is important to the security of the networked
systems: operators must know what is running at each level of the
architecture so that the plans for protection and recovery are valid for the
known configuration. One of the challenges with service oriented
architecture is the loose coupling of the various services. Service providers
must maintain some standards of configuration control to assure data users
of the integrity of the service being queried.

• The GIG Information Assurance (IA) infrastructure is built on the five key
elements listed below; however, net-centric operations and warfare need
an information assurance policy and procedures for balancing the needs of
the users against the levels of protection. In other words, they must
maintain a level of protection that is affordable, enforceable, and
non-intrusive enough for mission accomplishment, but strong enough to ensure
that mission accomplishment is not compromised by the degradation or loss of
the network or some of its components. Currently, the GIG IA
infrastructure is defined by the following.

• Transactional information protection

• Distributed and automated digital policy enforcement

• Defense against internal adversaries

• Integrated security management

• Embedded IA within enterprise components for a net-centric trust
model

• For the current state of networked systems, whether interoperable,
federated or stand-alone, and for the target GIG and its components, the
DoD network enterprise risk management plan should be robust,
holistic, and structured, but should also be understandable and applicable
at every aggregation and layer of the network system.

6. Service Networks Integration into the GIG

FORCENet is an example of how a service’s architecture contributes to the
development of the target GIG when incorporated into the GIG system. Aligned with the
architectural principles of the GIG, the Navy sees its FORCENet system as the
integration of networks and communication with capabilities that are distributed and
agile, able to make changes to configuration “on the fly.” Network configuration
adaptability allows a tailored network structure to be composed as needs are realized to
meet the mission requirements in a distributed environment. The security mechanisms
and Information Assurance program of FORCENet are envisioned to deliver information
to the warrior that assures confidentiality, trust, integrity, availability, authentication and
non-repudiation (inability to deny that correct information was received) (Stewart, 2006).
FORCENet uses commercial standards in an open architecture philosophy and follows the
same development principles as the GIG through the spiral model. This methodology is
designed to be warrior-focused. Commercial standards and open architecture are used in
the development of FORCENet to bring commonality to the systems that comprise
FORCENet and to use standards already in place and working on commercial
systems, avoiding the need to generate new standards through the development of
proprietary, FORCENet-specific software. The purpose of designing FORCENet in
a spiral development model is to transition stove-piped networks and programs from their
programs of record to a net-centric configuration. The objective of the warrior-focused
approach is to ensure the paradigm of “sensor to shooter”: the ability to get the complete
picture prior to making a decision to engage (Might, 2004).

Communication between services is based on the system presently built into the
World Wide Web, the use of Extensible Markup Language (XML) and the Extensible
Tactical C4I Framework (XTCF). Figure 3 shows a pictorial representation of the
communications infrastructure as envisioned in the GIG.

The GIG communications architecture is based more on wireless
connectivity than the Internet is, owing to the military’s inherent mobility.

Figure 3. Communications Infrastructure of the GIG (From: DoD CIO, 2007)

Communications technology, from a physical standpoint, entails several different
modes, but the infrastructure is principally connected by fiber-optic cable and wireless
technologies utilizing airborne and satellite relays. This infrastructure is not
new for the military; what is new is the architecture that combines these
communications means into a cohesive network that delivers accurate, timely, and
sufficient information to meet the ultimate objectives of military strategy.

B. PURPOSE

1. After-the-Fact Risk Management Diminishes Network Capability

The purpose of this thesis is to investigate all aspects of a suitable network
enterprise risk management plan in a GIG-like environment. Specifically, it provides the
following.

• An examination of various network architectures and the advantages and
disadvantages of different arrangements with regard to their ability to
resist and recover from network intrusion involving loss of confidentiality,
integrity, or availability of data and functionality

• Risk and the risk management process; in particular, how operational and
strategic risk management can be applied to operational and strategic
global networked systems, including the GIG and the services’ network
operations, and to the mission success of a large organization such as the
U.S. military

• Specific risk management process models in use today with varying
degrees of success by business and governmental organizations and how
they can be applied to a risk management plan for NCO/NCW

• The definition of a survivable/resilient system and the necessity of including
the attributes of survivability and resiliency in network designs and in
the development of a risk management process/program

• Network attributes that promote the protection of critical network
resources from the disruption of network operations or the compromise of
critical data. These attributes include defense in depth, fault tolerance,
diversity and distribution, and redundancy and replication. As the Chief of
Naval Operations has designated all his networks as critical, finding the
attributes that work best for a given network is a top priority.

• A bridge between network risk management and architecting a
network to meet the dual objectives of capability and security, with the
ultimate goal of a global network of Joint military capabilities that can
recover quickly in any environment or theater of operations and meet the
war fighters’ needs

Why is the study and analysis of a network enterprise risk management plan
vitally important? The complexity of the GIG and the services’ networks (FORCENet, for
one) makes them vulnerable to attack on a number of fronts. If the networks provide the
war fighter the information needed to win the battle, it is important to know where these
vulnerabilities are and how to mitigate them so the vital information keeps flowing to the
right places and individuals. It is postulated here that this risk management is best
developed from the top down, from the network enterprise operations through the nodes on
that network and the services they provide to the users, so that the plan is effective
end-to-end.

Risk can be dealt with in a number of different ways. Generally, risk can be
eliminated, mitigated, transferred or avoided. While the IA engineers from NSA, DISA,
and the services’ systems commands have implemented many technical and
process-oriented protections for the military’s vast array of stove-piped and legacy
networks, it is common practice, as prescribed by network operating procedure, to avoid
risk by isolating and terminating network operations that have been attacked, no matter
what the level of the attack, the level of interruption to services, or the amount of
destruction to network data or programs. Risk management of the military’s networks today
consists to a great extent of information assurance efforts to examine and implement the
best IA and protection technology tools, procedures and controls (countermeasures) to
limit the possibility of intrusion from without and within that would cause degradation to
the network or one of its nodes. Risk management planning appears to be somewhat
reactive, trying to plug the holes of vulnerability with technology, operational
restrictions, or reorganization of resources and controls against known threats, and
hoping that the technology guards against the unknown threat. Little evidence exists
that the countermeasures employed to resist attack are analyzed for their effects on
network operations and on the ability to bring information to the war fighter. In fact,
the standard operating procedure appears to be a form of risk avoidance when an
intrusion is detected or a fault or failure is realized. The procedure is usually to “turn it
off” as soon as a problem, real or imagined, is found. Growing concern exists among the
war fighters about what might happen in the middle of a critical operation where the
advantage in the fight is information and knowledge superiority: when a fault is
detected in the network system, network system administrators and network management
disable the network and deny the advantages of connectivity and information
accessibility to the war fighter until the fault is located and fixed.
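
The contrast between a blanket shutdown and a response matched to the event can be sketched in a few lines of Python. This is an illustration only: the 1-to-5 severity scale, the thresholds, and the function names are assumptions introduced here, not DoD operating procedure.

```python
from enum import Enum


class Treatment(Enum):
    """The four general ways of dealing with risk named in the text."""
    ELIMINATE = "eliminate"
    MITIGATE = "mitigate"
    TRANSFER = "transfer"
    AVOID = "avoid"


def blanket_response(severity: int, mission_critical: bool) -> Treatment:
    """Practice described above: isolate and shut down on any event.

    Both arguments are deliberately ignored.
    """
    return Treatment.AVOID


def graded_response(severity: int, mission_critical: bool) -> Treatment:
    """Hypothetical policy; severity is an assumed scale of 1 (minor) to 5 (severe)."""
    if not 1 <= severity <= 5:
        raise ValueError("severity must be between 1 and 5")
    if severity == 5:
        # Assumed cutoff: damage outweighs the value of staying connected.
        return Treatment.AVOID
    if mission_critical:
        # Keep a degraded service flowing to the war fighter.
        return Treatment.MITIGATE
    if severity >= 3:
        return Treatment.AVOID
    return Treatment.MITIGATE
```

Under these assumed thresholds, a severity-2 event during a critical operation is mitigated in place under the graded policy, whereas the blanket policy disconnects the network regardless of severity or mission.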

2. Summary of Purpose

This thesis investigates network enterprise risk management, determines
some of the likely risks in operating a network as complex as the GIG, and identifies
architectural tradeoffs available to improve network connectivity, functionality and
security. DoD and services’ risk management plans in place today can be improved
through the adoption of processes that examine architecting a network for the following.

• Optimization of the often-competing objectives of functionality and
information protection

• Survivability and resiliency, to allow continued network
connectivity even if limited

Principally, the network enterprise risk management plan can be enhanced to
guide the management of network services and the formulation of policies and
procedures that support the war fighters in accomplishing their mission.

C.  RESEARCH  QUESTIONS 

This  thesis  attempts  to  answer  or  provide  some  evidence  to  respond  to  the 
following  questions  concerning  network  enterprise  risk  management  of  the  U.S. 
military’s  network  systems.  The  term  networks  means  the  “global  domain  within  the 
information  environment  consisting  of  the  interdependent  network  of  information 
technology  infrastructures,  including  the  internet,  telecommunications  networks, 
computer systems, imbedded processors and controllers” (Davis, 2008, p. 3).

• What considerations are important in developing and improving the risk
management planning and processes presently used by DoD and the
military services for their information systems and networks on the global
domain, now that Net-Centric Operations, Net-Centric Warfare, and Information
Dominance are at the center of the nation’s defense strategy?

• When and how should a robust network enterprise risk management
program be implemented that supports the confidentiality, availability,
integrity, reliability, and trustworthiness of the military’s critical
information resources and also acts as an enabler of mission success in the
operational, strategic, and business process domains?

• What are some of the architectural implications in the design, and the
process imperatives in the operation, of networked information systems
that make a global network system survivable and resilient to attack from
hostile forces?

• Are there network risk management processes already in existence in the
public domain that could support DoD’s network enterprise risk management
process in achieving a survivable and resilient enterprise network, and
support DoD in making risk assessments and decisions on the cost/benefit
or value of the choices in implementing network security measures that
ensure quality of service, provide information assurance, and meet the needs
of the end user?
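
The cost/benefit question in the last bullet is often framed with annualized loss expectancy (ALE), a common quantitative measure in information security risk analysis. The Python sketch below is illustrative only; the function names and every dollar figure are assumptions made up for this example, not values from DoD or from the models this thesis examines.

```python
def annualized_loss_expectancy(asset_value, exposure_factor, annual_rate):
    """ALE = (asset value x exposure factor) x annual rate of occurrence."""
    if not 0.0 <= exposure_factor <= 1.0:
        raise ValueError("exposure_factor must be between 0 and 1")
    if asset_value < 0 or annual_rate < 0:
        raise ValueError("asset_value and annual_rate must be non-negative")
    single_loss_expectancy = asset_value * exposure_factor
    return single_loss_expectancy * annual_rate


def safeguard_value(ale_before, ale_after, annual_cost):
    """Net yearly benefit of a countermeasure; positive means it pays for itself."""
    return ale_before - ale_after - annual_cost


# Hypothetical figures for illustration only.
before = annualized_loss_expectancy(asset_value=200_000, exposure_factor=0.5, annual_rate=2)
after = annualized_loss_expectancy(asset_value=200_000, exposure_factor=0.5, annual_rate=0.5)
net = safeguard_value(before, after, annual_cost=60_000)
```

The arithmetic is simple; the difficulty lies in the inputs, particularly the asset value, which this thesis identifies as an important but hard part of the risk management process.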

D. BENEFITS OF THIS STUDY

This study is intended to support improvements to the Computer Network
Operations/Computer Network Defense risk management process of DoD and the service
components, and to support the military’s network management team as it
examines risks to the present system of networks, which provide information to the
warriors, and to the network systems under development comprising the target GIG. This
work encourages the incorporation of a network enterprise risk management approach
when making decisions about acquisition, design, development, and operation of military
networked systems.

E. SCOPE AND METHODOLOGY

The scope of the thesis is the investigation of network risk management plans
used by other organizations and the applicability of some of the plans’ elements to a generic
network enterprise risk management plan for DoD, Joint, and Navy use. As part of the
development of the above elements, the thesis investigates current models used in risk
assessments (CORAS, CRAMM, and OCTAVE, to name a few) and the possibility of
adapting or developing a model for use as a decision tool in the military’s network
enterprise risk management process. The methodology entails the following.

• Research in the literature and by selected interviews into present risk
management processes in the military and in other organizations with
complex networked systems (preferably architected using SOA)

• An evaluation of current risk management plans and models and an
analysis of the models’ application to the military network system

• An investigation of some architectural frameworks that might serve to
reduce the risk of operating network systems or to inform network
designers of limitations, constraints, and assumptions arising in the design
of networks when a proper assessment of network risk is conducted,
looking at vulnerabilities and threats, the impact of network
degradation or loss, and the value to mission accomplishment of procedures
that maximize network availability in a degraded state

II. APPLICATION OF SYSTEMS ENGINEERING PRINCIPLES
TO NETWORK ENTERPRISE ARCHITECTING

A. INTRODUCTION

This  chapter  discusses  the  architecture  of  a  network  from  the  hardware 
perspective  of  network  topology  and  from  the  software  perspective  of  open  architecture 
and  SOA.  It  examines  the  implications  of  managing  the  risks  associated  with  network 
operation  and  how  risk  management  is  affected  by  the  software  architectural  design  and 
the  arrangement  of  network  components.  The  chapter’s  brief  discussion  of  the  different 
layers  of  a  network  system  and  their  interaction  with  respect  to  network  security  and 
managing  risks  is  covered  more  fully  in  Appendix  A.  Subsequent  chapters  examine  the 
risks  inherent  in  conducting  computer  network  operations;  how  an  organization  might 
analyze  the  risks  to  network  operations  by  identifying,  assessing  and  managing  risk  both 
from  a  systems  view  of  the  network  enterprise  and  the  lower  level  view  down  to  the 
client  workstation  level;  and  how  process  models  that  employ  both  qualitative  and 
quantitative  methods  are  useful  in  informing  network  designers  and  operators  how  to 
mitigate  the  risk  of  attack  and  protect  the  valuable  assets  of  the  network  and  the 
information in it. Accompanying risk analysis is the study of how the network’s
survivability attributes, such as fault tolerance, support the management of risk.
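
One simple topological check related to fault tolerance is to look for single points of failure: nodes whose loss would split the remaining network. The Python sketch below is a minimal illustration under stated assumptions; the two example topologies and all names are hypothetical and do not represent any GIG or service network.

```python
from collections import deque


def is_connected(adjacency, removed=None):
    """Return True if all nodes other than `removed` can still reach one another."""
    nodes = [n for n in adjacency if n != removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current]:
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(nodes)


def single_points_of_failure(adjacency):
    """List nodes whose loss would disconnect the rest of the network."""
    return sorted(n for n in adjacency if not is_connected(adjacency, removed=n))


# Hypothetical topologies: a star around one hub, and a four-node ring.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
ring = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
```

In this example the star depends entirely on its hub, while the ring survives the loss of any one node. The check covers connectivity only; it says nothing about capacity, latency, or the security of the surviving paths.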

The architecture of a network comprises a number of elements. These
include the topology of the network (arrangement of nodes and connections), the
abstraction of information as it travels through a network, the standards used to assemble
a network and the standards for packaging the information transiting a network, the type
of components used for network node construction (switches, routers) and for transit
paths (arcs) through the network (fiber-optic, twisted pair, wireless, satellite), how and
where network components are acquired, the construction and control of the interfaces,
the location of the network’s information assets (data, programs), how and where the
network connects to other networks or the Internet (gateways) as in a SOA, and the
physical location of various components interacting with the physical environment
(shipboard, desert, and so on).

Networks are architected in different arrangements to provide different
capabilities while efficiently using available resources. The way a network is architected
can introduce vulnerabilities through the make-up of the network components, through the
way the components are connected, and through the methods employed, and the network
layer at which they are employed, to protect network and information
assets. As network vulnerabilities are discovered and the risks
associated with those vulnerabilities are assessed, it is important to determine which
network quality of service attributes the protection services and countermeasures
(technological, procedural, or managerial) seek to secure or enhance through the
protection of data and program resources, and how these attributes can be secured
through innovative system design and network architecting. It is important in the
development and operation of a network (as of any system) to learn how
vulnerabilities are created and how they are discovered. This information provides system
developers with valuable experience to draw upon when developing follow-on systems.
First, a brief discussion follows about when to commence a risk analysis on a
network and the importance of making risk analysis a continual process, both to take
advantage of the feedback afforded by monitoring the success of mitigation efforts and to
keep the analysis current as the organization’s objectives, technology and threats change
over time.

It is the contention of this thesis that the identification, assessment, and
management of network risks need to be done early in the system engineering cycle, in
tandem with system concept definition. It is at this point that the Concept of Operations
is the guide for determining the system’s functions and for defining the system’s
requirements to meet the intended mission. Whether form follows function or form
drives function, the architectural foundation of the system must include an assessment of
the system’s level of risk from threats and of how the system is architected to mitigate
those risks or recover from an attack. The same is true for developing networks. All too
often, a computer network and the interconnected information systems are a collection of
systems connected together for functionality, and only after the network has been intruded
upon are security measures integrated into the network system by means of software
patches, which are put in place to mitigate another attack of the same or similar
characteristics.

Before information systems became a ubiquitous commodity, system design built
in safety as one of the design requirements, and system safety was required to meet strict
specifications in critical control and operating systems where human life and valuable
property were at risk. With the advent of the revolution in information technology, and
probably because of its rapid pace, system capability and the tremendous number of
applications that the new technology brought were given priority over safety and security
considerations. Often, security was not just second on the priority list; it was almost
ignored. Thus, with the systems already in place today, security tends to be more
reactive; vulnerabilities are treated with patching to seal up the place in the program that
has already been exploited (Davis, 2008). The same philosophy holds for the way many
major commercial software products on the market are developed today. Security is
covered by a library of software corrections to fix the vulnerabilities discovered by
attackers. Vulnerability libraries keep expanding as new methods of malicious behavior
from threat agents are discovered, usually through a new attack on a legitimate system.
At least the U.S. Government is trying to stay one step ahead through the Comprehensive
National Cybersecurity Initiative (CNCI) (Germain, 2008). CNCI initiatives are an
attempt to be proactive (in the true sense of the word, meaning action before
consequences) by looking at establishing a front line of defense, developing
cyber-counterintelligence plans, and shaping the future through cyber-supply chain
management, deterrence, and defining cyber-security for critical infrastructures (Davis,
2008).

Networks are often created by connecting pre-existing stand-alone information
systems together, often on an ad hoc basis. With so many individual systems in the U.S.
military inventory, the DoD has decided that the best architectural standard for
connecting legacy systems together is SOA. One of the advantages
of SOA is software reuse. Legacy systems in use today that were not developed to
withstand today’s threat environment are networked in the new architecture to
avoid redeveloping the functionality these systems already provide. In addition to putting
risk management in the systems engineering process from the beginning, the urge to
ignore the security faults that legacy systems contribute to the network must be resisted.
It is tempting to let resource constraints (as well as human nature’s resistance to change)
drive the decision whether to take the additional step of architecting a system to resist
and recover from attacks as well as to meet capability requirements.
Additionally, while it may be at present, legacy systems’ functionality will not be
stand-alone in the future under the vision of SOA. The SOA, how it works and its advantages
and disadvantages, is discussed later in this section. It is envisioned under SOA that
legacy systems fit into the network architecture to connect their services (functionality) to
the information grid. Because these systems are already in an operational status in their
life cycle, it is critical that a thorough risk analysis be done at the interfaces connecting
them to the larger network. It is at these interfaces that a threat agent is most likely to
seek to penetrate and harm a legacy system’s functionality. Through intelligent software
architecture of the interfaces to legacy systems and the other Net-Centric Enterprise
Services and applications, risk mitigation strategies can be implemented to provide
protection to these assets and to the information and control required of them.

B. NETWORK CHARACTERISTICS

1. Attributes of a Network

There  are  as  many  ways  to  design  and  connect  the  components  of  a  network  as 
there  are  networks.  No  two  are  exactly  alike.  Networks  can  be  characterized  by  the 
following. 

• How they are physically or virtually (through software) connected,
called the topology.

• The basic function of the network (e.g., data storage and retrieval,
command and control, business services, collaboration, supervisory
control and data acquisition).

• The layer of abstraction of the communications between nodes in a
network (data layer, network layer, session layer, presentation layer).

•  The  specifications  of  a  network  in  memory  capacity,  processing  power, 
signal  latency,  and  bandwidth. 


• The number and types of components comprising the network (switches or
routers, central processors or embedded controllers) and the connecting
devices used (Ethernet, synchronous optical network (SONET), optical
fiber).

• The degree of accessibility or classification level of the network.

• The size of the network (Local Area Network, Wide Area Network,
Metropolitan Area Network) and its diameter (how many interconnections
between end-to-end users).

The architecture of a network depends on the design and function of the
individual components (nodes) and the way they are connected together (arcs of a
network) to achieve an enhanced capability through their connectivity. The basic
functional characteristics of a network node when relaying packets or frames (wireless)
can be described with the following parameters:

•  Memory  (buffer)  capacity 

•  Processing  speed  (switching/relaying) 

The communications connection (arc in a nominal network) has the
following performance parameters:

•  Bandwidth  (throughput  rate) 

•  Interface  processing  speed  (from  Ethernet  to  fiber  optic  signals) 

The  goal  in  a  network  path  is  to  get  the  signal,  undistorted,  through  the  network 
the  quickest  way  under  high  bandwidth  and  low  latency  conditions.  The  parameters  are 
constrained  by  signal  latency,  which  is  the  time  it  takes  for  the  signal  to  get  from  the 
source  to  the  destination.  This  is  determined  by  the  number  of  nodes  (switches),  the 
switching  time  for  each  switch  it  has  to  transit,  and  the  time  it  takes  for  the  signal  to 
travel  through  the  communications  cable.  The  bandwidth  determines  the  amount  of 
information  that  can  be  sent  at  once.  Each  time  a  signal  passes  through  a  node  on  the  path 
between  source  and  destination,  it  is  called  a  “hop”  in  network  terms.  Since  the  travel 
time  of  light  through  the  cable  is  a  small  fraction  of  the  time  for  a  switch  to  relay  the 
signal  between  communication  paths,  the  biggest  contributor  to  latency  is  the  number  of 
hops  the  signal  makes  before  it  reaches  its  destination  (Sterbenz,  2006). 
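
The latency relationship described above can be illustrated with a minimal sketch. The function name, the per-switch delay, the cable length, and the signal speed in fiber (roughly two-thirds the speed of light) are illustrative assumptions, not values from the cited source, and the model ignores queuing and transmission time.

```python
def path_latency_s(hops, switch_delay_s, cable_length_m,
                   signal_speed_m_per_s=2.0e8):
    """Estimate one-way latency as switching delay plus propagation delay.

    Assumption: every switch on the path adds the same fixed delay, and the
    signal travels at about two-thirds the speed of light in the cable.
    """
    switching = hops * switch_delay_s
    propagation = cable_length_m / signal_speed_m_per_s
    return switching + propagation


# Hypothetical figures: 1 km of fiber and 50 microseconds per switch.
few_hops = path_latency_s(hops=2, switch_delay_s=50e-6, cable_length_m=1_000)
many_hops = path_latency_s(hops=10, switch_delay_s=50e-6, cable_length_m=1_000)
print(f"2 hops: {few_hops * 1e6:.0f} us, 10 hops: {many_hops * 1e6:.0f} us")
```

Under these assumed numbers the propagation term is only a few microseconds, so the hop count dominates, which is the point the paragraph makes. Over very long cable runs the propagation term would no longer be negligible.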

A network diameter is the topographically farthest distance that a signal can go
from source to destination. The “edge” of a computer network comprises those
nodes that do not act as relays for other signals. The number of hops a signal must take to
get from edge to edge is a measure of the network’s diameter. Thus, a network is
bounded by its diameter. The Internet is described as unbounded because it is so large
and it does not appear to have an edge (Sterbenz, 2006).

Aggregation  in  a  network  means  connecting  nodes  to  a  central  point.  For  a  given 
number  of  nodes  in  a  network,  higher  aggregation  of  the  network  means  smaller 
diameter.  More  of  the  nodes  are  connected  in  a  star  pattern  to  a  central  routing 
mechanism  as  is  shown  in  Figure  4. 

[Figure 4 panels: Low Aggregation, High Diameter; High Aggregation, Low Diameter.
Figure note: Knowing the “shape” and number of connections in a network points the
architect to vulnerabilities that can be corrected by changing the “shape” without
sacrificing functionality.]

Figure 4. Relationship between Diameter and Aggregation in a Network
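
The relationship between aggregation and diameter can be checked on two small example networks. The sketch below is illustrative only: it assumes a connected network, measures diameter as the largest number of links on any shortest path, and uses hypothetical helper names (`chain`, `star`, `diameter`).

```python
from collections import deque


def hop_distances(adjacency, start):
    """Breadth-first search: number of links from start to every node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def diameter(adjacency):
    """Longest shortest path, in links, over all node pairs (connected graph)."""
    return max(max(hop_distances(adjacency, n).values()) for n in adjacency)


def chain(n):
    """Low aggregation: n nodes connected one after another."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def star(n):
    """High aggregation: node 0 is the central point for n - 1 other nodes."""
    adjacency = {0: list(range(1, n))}
    for i in range(1, n):
        adjacency[i] = [0]
    return adjacency


print(diameter(chain(6)), diameter(star(6)))
```

For the same six nodes, the chain has a diameter of five links and the star a diameter of two, at the cost of concentrating all traffic on the central node.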


Networks can be aggregated into three basic types: client-server, peer-to-peer, and
a hybrid of these types. In the client-server arrangement, a component called the server
controls the network communication between several “client computers” connected to the
server, usually in a star topology. Peer-to-peer has no single component
controlling communication; traffic management is done through the collaboration
of the “peers,” or active computers in the network, that are communicating. Client-server
has the advantage of moving signals through the network more efficiently in an
ordered pattern, depending on the network management program in the server, but the
disadvantage in an intrusion scenario is that penetration and disruption of the server
affects all the clients attached to that server. In wireless networks, transmission range and
directional coverage are used to aggregate and control density. In high transmission power
wireless, every node is connected to every other node. In low power wireless, nodes and
overlays are used to control density and network diameter. Wireless network attributes
take on significant importance because of the GIG’s expanded use of wireless as
opposed to the heavy land-line use in the Internet. DoD recognizes that with mobile
forces, a significant amount of network connectivity is wireless-based (Sterbenz, 2006).

The  scale  of  a  network  is  the  number  of  nodes  and  connections  between  nodes  in 
a  network.  A  network’s  scale  can  be  controlled  by  architecting  the  network  into  a 
hierarchy  or  by  clustering  segments  of  the  network.  In  addition,  the  clustering  controls 
system  state  as  each  cluster  in  the  network  can  retain  its  own  state  separate  from  the  other 
clusters.  Clustering  in  a  hierarchy  can  also  control  the  amount  of  aggregation,  thus 
limiting  the  effect  of  failure  in  a  central  node  and  not  allowing  it  to  affect  the  entire 
network  or  large  portions  of  it.  Clustering  also  supports  the  management  of  bandwidth 
allocation  within  the  hierarchy’s  sub-networks  so  that  bandwidth  through  the  larger 
network  is  managed  when  bandwidth  is  a  controlled  commodity  of  the  network.  Mesh 
networks (a topological arrangement discussed below that has the network nodes
connected to every other network node through its own arc) form natural clusters and
can  scale  better  than  bus  networks,  allowing  more  versatility  and  adaptability  in  the 
network  architecture.  This  also  improves  the  network’s  resiliency  and  recoverability  after 
attack  and  network  fault  or  failure  (Sterbenz,  2006). 

2.  Network  Quality  of  Service  Attributes 

The  quality  of  service  (QoS)  attributes  that  need  protection  from  malicious  actors 
who  would  seek  to  disrupt  operations  are  confidentiality,  integrity,  and  availability 
(Davis, 2008). It is the goal of the network’s owners to ensure that the data, services, and
the control of critical operations are (1) available when and where they need them, (2)
unavailable to those that should not have them, and (3) uncorrupted by those that
should not have access to them, so that the desired tactical, operational, and strategic
effects happen and the safety of personnel and systems is not compromised. Conducting a
risk  analysis  of  the  network  is  vital  to  understanding  how  at  risk  these  three  attributes  are, 
and  whether  or  when  they  may  be  compromised  because  of  system  unreliability, 
accidental  faulty  operation,  intentional  intrusion,  manipulation,  or  denial  of  network 
operations.  The  definitions  of  the  three  main  QoS  attributes  are  as  follows. 

•  Confidentiality  is  the  quality  attribute  that  information  is  seen  or  given 
only  to  those  authorized  to  see  it.  If  someone  unauthorized  unintentionally 
or  intentionally  can  gain  access  to  the  information,  then  confidentiality  is 
compromised  or  lost. 

• Integrity is the assurance that information received is the same information
that was sent, with nothing added, subtracted, or altered. If data integrity is lost,
information at the reception end sometimes seems ambiguous; however,
ambiguity is often created by the sender and should not necessarily be
attributed completely to a loss of network integrity. A compromise of
integrity is difficult to detect unless there is a way to compare the
information received against information from another source, or unless
a quality code (checksum) is attached to the information sent.

•  Availability  is  quantitatively  defined  as  the  percentage  of  time  that  the 
network  system  is  operating  as  intended  to  produce  the  effects  desired  by 
the  network  owners  and  users  (Hernandez,  2001). 
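
As a worked illustration of the quantitative definition of availability above, the following minimal sketch computes the percentage from operating and outage time. The function name and the sample hours are hypothetical.

```python
def availability_percent(uptime_hours, downtime_hours):
    """Percentage of the observed period during which the system operated as intended."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("observation period must be positive")
    return 100.0 * uptime_hours / total


# Hypothetical year of 8,760 hours with 8.76 hours of outage.
print(f"{availability_percent(8751.24, 8.76):.1f}%")
```

Under this definition, an outage that falls during a critical mission window counts the same as one that does not, which is one reason the thesis treats resiliency separately.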

In addition to these three attributes, non-repudiation (neither sender nor receiver
can deny sending or receiving what was sent or received) and authentication
(verification of the identification of people and information) are important attributes to
protect on networked systems.
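
The “quality code” idea from the integrity definition, combined with authentication of the sender, can be sketched with a keyed hash (HMAC) from the Python standard library. This is a swapped-in illustration rather than a mechanism named in the source: the shared key, the sample message, and the helper names `make_tag` and `verify` are assumptions. A shared-key tag does not provide non-repudiation, because either key holder could have produced it; that attribute generally requires digital signatures.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute a keyed check value to send along with the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


shared_key = b"hypothetical-shared-secret"
message = b"hold position until 0600"
tag = make_tag(shared_key, message)

print(verify(shared_key, message, tag))                        # unaltered message
print(verify(shared_key, b"advance position at 0600", tag))    # altered in transit
```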

C.  NETWORK  HARDWARE  ARCHITECTING 

The  topology  of  a  network  is  the  arrangement  of  how  a  network’s  components  are 
connected,  physically  and  logically  through  software.  There  are  many  considerations  and 
motivations  for  choosing  a  particular  topology  for  a  network;  cost  and  capability  are 
probably  among  the  preeminent.  A  network  can  take  on  a  topological  arrangement  from 
its design and over time may take on another shape as the network evolves through added
technology, changing requirements, and, not least, its addition and connection to
other  networks.  Among  the  motivations  for  choosing  a  particular  network  topology, 
architecting  it  to  protect  its  purpose  and  its  contents  and  to  contribute  to  the  protection  of 
the  networks  with  which  it  is  interoperable  or  loosely  connected  should  be  of  the  same 
importance  as  the  capability  it  provides.  No  matter  how  capable  the  network  and  its 
component  functions  are  or  what  contribution  it  makes  to  the  larger  network  enterprise,  if 
it  is  architected  with  vulnerabilities  that  can  easily  be  exploited,  any  contribution  to 
meeting the organization’s objectives is most likely to become detrimental. More often
than not, a compromised network aids an attacker unknown to the network’s owners
until it is too late.

There are five basic types of network topologies: bus, star, ring, tree, and mesh
(Kioskea, 2009). While the pictorial representations of these arrangements look like their
descriptions, the actual arrangement is created in the hardware used to make the
connections and in the software of the components that connect the nodes
together. While the basic topology of a network is the connection of components (as in a
Local  Area  Network),  the  nodes  of  the  network  could  be  other  networks,  which  are 
attached  in  the  prescribed  arrangement.  These  topologies  are  discussed  below  (images 
obtained  from  Google  Pictures). 

1. Bus Topology

In the bus topology, the components or nodes of the network are connected to a
common bus known as the “backbone.” It is the simplest organization of a network; each
component is connected by a communications line and its hardware adapter to the
common bus. It is also vulnerable to degradation or failure should one of the components
fail, in particular a component that is attached to manage the traffic across the bus. While
the advantage of a bus is that it is easy to add components to the network, this also makes
it easy for an unwanted component to add itself to the bus and gain access to the
authorized components.

2. Ring Topology

The computers in this topology are not necessarily arranged in a physical ring;
however, they are connected by software that handles transactions between components
in a ring pattern, handling component “broadcasts” or requests for service in the order in
which the components are located on the abstract ring. Information flows within this
connection in one direction, and each component has a turn to transmit or receive data in
its order in the ring. The traffic management is usually handled by a protocol such as
Token Ring or Fiber Distributed Data Interface (FDDI). The FDDI
architecture can provide a dual ring for added stability and network recovery. If one ring
fails, the second ring picks up the communications management task. Under a dual ring
technology with a concentrator (a multiplexer that combines many signals into one),
individual component failures have less effect on network performance than under a
single or dual ring topology without one. Unlike a bus, the ring topology contains
components in a tighter arrangement with more resistance to outside intrusion; however,
the basic ring is vulnerable because failure of a component or of one communication line
to the ring causes the entire network to fail and cease communication. As mentioned, the
dual ring arrangement offers a recovery capability. The time to establish the second ring
would be an important specification, depending on the service or data requirements of
the network.

3. Star Topology

Star topology connects the components of a network by communications lines
to one central location called a hub. Traffic management and communications order
are maintained in the hub, which sends a communication from a source to a
destination without involving the other members of the network that were not intended
to receive it. An increase in the sophistication of the hub device allows for more
capability in traffic management and detection of stray or unwanted traffic. The hub
could be merely a central collection and distribution point or a switch to control timing
and priority of message delivery. A router, as the central point of the network, can
provide additional capability in the connection to other networks and as a firewall for
incoming traffic. A router is a network hub with special capabilities to bridge to other
networks. One advantage of the star topology is that the failure of one component or its
communication link does not affect the rest of the network, unless, of course, the
component that fails is the hub. Star topology also offers the shortest latency between
nodes on the star, the latency depending mainly on the switching speed of the central
hub. However, the importance of the hub in the network makes this component a
single point of failure and therefore a likely target of attack.
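
The single point of failure in a star can be made concrete by counting how many pairs of surviving nodes can still communicate after one component fails. The sketch below is illustrative: the node names and the helper `connected_pairs` are hypothetical, and a failed node is modeled simply as removed from the network.

```python
from collections import deque


def connected_pairs(adjacency, failed):
    """Count pairs of surviving nodes that can still reach each other."""
    alive = set(adjacency) - {failed}
    seen = set()
    pairs = 0
    for start in alive:
        if start in seen:
            continue
        # Collect every surviving node reachable from this starting node.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor in alive and neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        seen |= component
        size = len(component)
        pairs += size * (size - 1) // 2
    return pairs


leaves = ["a", "b", "c", "d", "e"]
star = {"hub": list(leaves), **{leaf: ["hub"] for leaf in leaves}}

print(connected_pairs(star, failed="a"))    # one leaf fails
print(connected_pairs(star, failed="hub"))  # the hub fails
```

With five leaves, losing one leaf leaves all ten remaining pairs connected, while losing the hub leaves none.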

4. Tree Topology

Tree topology is a collection of star or ring topologies or individual components
that attach to a central bus via a concentrator component, with the concentrator (a
multiport hub) acting as the root for that branch. The concentrators can be connected in
a hierarchical fashion, with the root concentrator managing messages for an upstream
concentrator as well as other individual components. This arrangement offers the
advantages of the simplicity of a bus for the aggregation of the individual network
arrangements and the protection of the star or dual ring topology for each individual
sub-network attached to the “tree.” There are
multiple points on the tree that can be used to provide distinct levels of protection
according to the level needed within the sub-network. Virtual Private Networks (VPN) or
layer networks can be established within the tree, and the hierarchical arrangement of the
concentrators can provide some defense in depth to critical components. Recovery
techniques can be applied to the entire network or to the individual branches. The tree is
vulnerable on its bus backbone, where other networks attaching to the bus can degrade
the rest of the network when one branch has a fault. It provides some fault
tolerance when the fault is located inside a branch.

5. Mesh Topology

A mesh network connects each component to more than one other component via
a dedicated communications channel. In a true mesh arrangement, all components are
connected to each other through dedicated communications channels, so that no one
component controls the communications of any other network component. As a network
expands, if a true mesh topology were used, each
component would need an increasing number of ports to connect to every other network
node. In reality, the connections in a mesh are numerous but not total, and some
components must have more connectivity than others to provide some type of traffic
management or translation of information if some nodes process information differently
than others. From the standpoint of vulnerability to intrusion, the mesh topology offers an
architectural design with very good resilience, in that a disconnection in one
communication channel can be overcome by rerouting through a different path, since all
nodes are connected in multiple ways. A mesh network can be complex to design and
fabricate, as well as expensive with all the porting and channels. Deciding on the best
mesh to maximize flexibility and resiliency while optimizing the total network
performance specifications of signal latency, bandwidth, and computational power can
turn into a multi-objective problem very quickly. As the diameter (number of nodes
between end-to-end applications) of the mesh network increases, the shortest path
between nodes quickly becomes constrained by individual bandwidth capabilities,
switching delays, and communications link distances.
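
Two claims in the paragraph above lend themselves to a short sketch: the port and channel cost of a true mesh, and rerouting around a lost channel in a partial mesh. The function names, node labels, and example links are hypothetical.

```python
def full_mesh_cost(n):
    """Ports needed per node and total dedicated channels in a true mesh of n nodes."""
    ports_per_node = n - 1
    links = n * (n - 1) // 2
    return ports_per_node, links


def has_path(links, src, dst, cut=None):
    """True if dst is reachable from src when the link `cut` is out of service."""
    adjacency = {}
    for a, b in links:
        if cut is not None and {a, b} == set(cut):
            continue  # skip the disconnected channel
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, stack = {src}, [src]
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return False


for n in (4, 10, 50):
    print(n, full_mesh_cost(n))

# Partial mesh: numerous but not total connections among four nodes.
partial_mesh = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "C")]
print(has_path(partial_mesh, "A", "B", cut=("A", "B")))
```

The channel count grows roughly with the square of the node count, which is why practical meshes are partial; the partial mesh shown still reroutes A to B through C when the direct channel is cut.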

D.  NETWORK  SOFTWARE  ARCHITECTING 

To acquire a flexible, adaptable, and resilient global network, DoD had to make
several system-of-systems (SoS) level architectural design decisions as it moved toward a
net-centric philosophy of warfare. Two of the major decisions were to design networks
under an open architecture computing environment and to build the global network to
distribute, store, and operate on information in a SOA. These decisions required the
acquisition and incorporation of commercial off-the-shelf (COTS) technology into
information systems and networks and were driven by the fact that developing this
technology in house would be too costly and untimely. Chapter V discusses survivability
of a network. However, it is important to note that, unlike survivability as defined for
weapons platforms in battle situations, survivability of networks is protocol-based, not
topology-based. Thus, it is the interaction between nodes that defines how the network
interfaces are designed through software logic.

1.  Open  Architecture 

a.  Open  Architecture  Computing  Environment  (OACE) 

The  use  of  Open  Architecture  standards  in  developing  information 
systems  and  networking  them  has  been  required  by  DoD  and  the  Navy  since  2004  (Naval 
Surface  Warfare  Center,  2004).  The  idea  of  open  architecture  is  to  use  COTS  products 
that  meet  common  industry  standards  and  to  incorporate  them  into  new  and  existing 
systems in a modular design. The Open Architecture Computing Environment (OACE)
defines the key system interfaces using commercial standards set by industry standards
organizations.

The  OACE  sets  the  standards  of  COTS  hardware  and  software 
components and systems that can be used in architecting information systems and
networks.  The  standards,  drawn  from  the  DoD  Joint  Technical  Architecture,  promote  user 
portability,  or  the  ability  to  develop  applications  that  interoperate  with  other  applications 
and  with  a  wide  range  of  suppliers  because  they  are  engineered  on  the  open  standards  for 
the  following: 

•  Communications 

•  Abstraction  of  services 

•  Application  Programmer  Interfaces 

The  OACE  is  based  on  a  reference  architecture  using  a  layered  approach 
so  that  specific  war  fighting  applications  can  ride  on  the  layers  and  can  interoperate  and 
communicate  with  other  mission-critical  applications  in  a  distributed  environment.  The 
standards in each layer are as follows and are shown in Figure 5.

• Applications: Java Programming, Java Community Process

• Language: American National Standards Institute (ANSI) C++ programming

• Middleware: Object Management Group (OMG) Common Object
Request Broker Architecture (CORBA) and Data Distribution Service
(DDS)

• Network Operating System: POSIX Operating System

• Networks: Internet Engineering Task Force (IETF) for networks and
protocols

• Physical Media: Telecommunications Industry Association (TIA) fiber
optics.

[Figure 5 note: Network layer abstractions allow the architect to focus on the form of the
signal as it exists in that layer, enabling better architecting decisions.]

Figure 5. OA Layered Approach (From: Naval Surface Warfare Center, 2004)


The functional applications built for the OACE are either
integrated or federated. Integrated means commonality system-wide: resource sharing and
enhanced  recovery  through  redundancy.  “The  integrated  approach  enables  mission 
flexibility  and  enhanced  failure  recovery  through  a  high  degree  of  redundancy  delivered 
via  operational  resource  sharing.”  (NSWC,  2004,  p.  12)  Federated  means  unrestricted 
choice:  maximum  flexibility  to  meet  unique  requirements. 

The OACE runs on the following layers, which mimic and are based on
four of the ISO layers.

• Physical layer: fiber optics carrying multi-mode messages, characterized by
wavelength and aperture. Physical security includes enclosures to provide
shock, vibration, and other protections from environmental conditions.

• Network layer: connectivity, transfer, and support protocols. Connectivity
is the data link layer providing logical connectivity (IEEE 802 and
Ethernet). Transfer is the network layer (IP, routing instructions, and
QoS). Support protocols are the many session, presentation, and
application protocols for communication, file transfer, and e-mail.

•  Transport  and  sessions  layer:  The  network  operating  systems  in  this  layer 
provide  structure,  priority,  timing  to  comply  with  real-time  operating 
systems  to  provide  predictability.  However,  since  the  network’s 
predictability  is  only  as  good  as  the  most  unpredictable  component,  the 
thrust  of  the  operating  system  standards  is  based  on  the  portability 
concept,  a  network  operating  system  that  can  interface  with  other 
networks  to  which  it  is  connected. 

Two  types  of  middleware  are  resident  in  this  layer,  adaptive  and 
distribution  middleware.  Adaptive  middleware  isolates  the  application  from  the  network 
hardware  and  the  operating  system.  A  Resource  Manager  supports  computing  capability 
(management  to  provide  fault  detection,  tolerance,  recovery)  for  computers  that  are 
input/output  intensive,  computing  intensive,  or  memory  intensive. 

Four types of distribution middleware are included in the OACE:
Distributed Objects protocol, Distributed Services protocol, Group-ordered
Communications protocol, and Message-passing Interface for data parallel applications.

•  Distributed  Objects  protocol  supports  data  exchange  by  invoking  methods 
on  program  application  or  data  objects  that  can  be  remote.  The  distributed 
object  protocols  allowed  by  OACE  are: 

•  Distributed  Component  Object  Model  (DCOM)  for  non-real  time 
business  enterprise  applications. 

• Java/Remote Method Invocation (RMI) for soft real time decision
aids.

• Object Management Group’s (OMG) CORBA for soft real time
command and control and hard real time sensors and weapons
control.

• Data Distribution Services (DDS) from OMG uses data-centric,
publish/subscribe communications control for Command and Control and
sensor/weapons control. Publish/subscribe distributes data to an
application that declares itself a member. Data is distributed from
anonymous servers to anonymous clients, as the information is not
addressed for specific routing end-to-end.

• Group-ordered Communication protocol provides a higher level of delivery
guarantees: ordering messages to maintain consistency of state between
replicated applications, and detecting and recovering from communications
failures. The application can tell what communication was transferred before
failure and what communication replication started after failure. Fault
tolerance is provided by application replication.

•  Message  Passing  Interface  (MPI)  can  be  used  for  low/no-latency  sensor 
control  where  real  time  control  of  data  is  important.  Using  data  parallel 
techniques,  this  protocol  is  designed  to  handle  parallel  processing 
applications  such  as  signal  processing  and  for  communication  across  a 
back  plane  of  a  massive  parallel  processor. 
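
The publish/subscribe pattern described for DDS in the list above can be illustrated with a minimal in-process sketch. This is not the OMG DDS API; the class `TopicBus`, the topic names, and the sample data are hypothetical, and real middleware adds discovery, QoS policies, and network transport.

```python
from collections import defaultdict


class TopicBus:
    """Toy data-centric bus: publishers and subscribers know only topic names."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """An application declares itself a member of a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, sample):
        """Deliver a sample to every current member; return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(sample)
        return len(callbacks)


bus = TopicBus()
received = []
bus.subscribe("sensor/track", received.append)

# The publisher addresses a topic, not a specific receiving node.
delivered = bus.publish("sensor/track", {"id": 7, "bearing_deg": 42})
print(delivered, received)
```

The publisher never names its receivers, which mirrors the anonymous server-to-client distribution noted in the DDS item.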

b.  Vulnerability  of  Open  Architecture 

Open  Architecture  has  the  advantage  of  reducing  development  costs  for 
new  system  software.  The  network  architecture  is  composed  in  modular  form  from  COTS 
products,  which  have  the  advantage  of  being  already  tested  to  a  limited  degree  for 
reliability.  However,  the  use  of  COTS  introduces  vulnerabilities  that  need  to  be  examined 
for  the  potential  of  their  exploitation  by  threats.  In  particular,  commercial  hardware  and 
software have limited or no test and verification pedigree, and limited documentation
inhibits  optimum  architectural  design  and  the  ability  to  determine  exact  reliability  or 
develop  certified  and  tested  recovery  procedures  (Anderson  &  Hundley,  1998). 

2. Overlay Networks

Overlay  networks  use  software  programs  to  draw  on  the  topology  of  the  lower 
physical  layers  and  are  defined  by  special  procedures  for  linking  certain  nodes  in  a 
network  together  for  special  purposes.  A  VPN  is  an  overlay  of  an  existing  topology  in  a 
physical  network  to  provide  security  and  privacy  to  certain  nodes.  The  overlay  or  VPN 
can  be  scaled  by  physical  characteristics  of  the  communication  signal  or  by  authorization 
and  authentication  of  new  nodes  requesting  to  become  part  of  the  VPN.  An  overlay  VPN 
can also be controlled through the use of “Hash Tables,” which are a form of intrusion
tolerant multicast protocol. Hashing assigns a non-descriptive header on data transmitted
over  a  network,  so  that  intercepted  data  cannot  be  reconfigured  to  its  original  meaning. 
This  form  of  addressing  and  describing  data  that  flows  through  the  network  is  mainly  a 
device  for  database  access  when  it  is  critical  to  have  a  high  quality  of  service  in 
confidentiality  (Walker,  2008). 
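
As a rough illustration of hash-based addressing in an overlay, the sketch below places node names and data keys in the same hash space and assigns each key to the first node at or after its hash value, a scheme usually called consistent hashing. This is an assumption about what a hash-table-controlled overlay might look like, not the mechanism described by Walker (2008). The `HashRing` class, the node names, and the choice of SHA-256 are hypothetical. The hashed identifier reveals nothing descriptive about the key, but hashing a header does not by itself encrypt the payload.

```python
import bisect
import hashlib


def opaque_id(label: str) -> int:
    """Map a descriptive label to a non-descriptive 64-bit identifier."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical overlay membership table keyed by hashed node names."""

    def __init__(self, nodes):
        # Sort the nodes by their position in the hash space.
        self._ring = sorted((opaque_id(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Return the overlay node responsible for a data key."""
        idx = bisect.bisect_left(self._positions, opaque_id(key))
        # Wrap around to the first node when the key hashes past the last one.
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("sensor-track-0042")
```

Because every participant computes the same hash, any authorized node can locate the owner of a key without a central directory, which is one reason overlays of this kind are considered tolerant of the loss of individual nodes.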


Overlay networks are an architecting
technique that uses existing topology to
provide the attributes of flexibility and
adaptability. A Virtual Private Network (VPN) is
an example of a network overlay.

Figure  6.  Example  of  Network  Overlay  (From:  Google  Network  Pictures,  2009) 

3. Service Oriented Architecture

a.  Service  Oriented  Architecture  Direction  for  the  Military 

DoD  and  the  military  services  have  several  thousand  applications  residing 
on  thousands  of  networks.  Many  of  the  applications  are  redundant,  but  they  are  accessed 
and  executed  by  different  means  and  in  different  languages.  Rather  than  discarding  all 
these  capabilities  from  numerous  programs  that  serve  various  functions  and  starting  over 
to  build  functionality  from  scratch,  DoD  and  the  services  are  migrating  their  future 
network software architecture to a SOA (DoD GIG, 2007).

b. Service Oriented Architecture Definitions

Vijay Gehlot (Slide 6, 2009) paraphrases Thomas Erl’s definition of
service oriented architecture as a model in which functionality is decomposed into
distinct units (services), which can be distributed over a network and can be combined
together and reused to create business applications. These services communicate with
each other by passing data from one service to another, or by coordinating an activity
between two or more services. SOA draws on its predecessor concepts of distributed
computing and modular programming. Technically, the communication between services
is defined using a description language. The services have callable interfaces that are
called upon to perform business processes. Each interaction is independent of every
other interaction and of the Internet protocols of the communicating devices. Since
interfaces are platform independent, a client can use the service from any device using
any operating system and any language (Gehlot, 2009). The communication independence
between client and service is what produces a loose coupling between the interfaces of
the network architecture. SOA is similar to the present architecture of Web services, both
of which use a service registry to allow a consumer of a service to discover available
services through the Web Services Description Language (WSDL), and to access the
service through an XML-based protocol called Simple Object Access Protocol (SOAP). In
SOA, the service directory and service description are contained in one location and
communicate under the Universal Description, Discovery and Integration (UDDI)
specification. Figure 7 is a simple picture of the SOA set-up.

[Figure 7 diagram: the Service Provider publishes to the Service Registry, the Service Consumer discovers services in the registry, and the consumer then binds to and invokes the provider.]

SOA discovers available services through a service
registry, which decouples the service from the
underlying operating system of the consumer,
protecting that service from consumer malfunctions.

Figure 7. Service Oriented Architecture Arrangement (From: Gehlot, 2009)
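
The publish, discover, and bind/invoke relationships in Figure 7 can be sketched in a few lines. This is a minimal in-memory illustration, not a UDDI, WSDL, or SOAP implementation. The `ServiceRegistry` class, the `track-lookup` service name, and the request fields are all hypothetical names chosen for the example.

```python
from typing import Callable, Dict


class ServiceRegistry:
    """Hypothetical in-memory stand-in for a UDDI-style registry."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[[dict], dict]] = {}

    def publish(self, name: str, endpoint: Callable[[dict], dict]) -> None:
        """Provider side: advertise a callable interface under a name."""
        self._services[name] = endpoint

    def discover(self, name: str) -> Callable[[dict], dict]:
        """Consumer side: look up a service by its published name only."""
        if name not in self._services:
            raise LookupError(f"no service published as {name!r}")
        return self._services[name]


def track_lookup(request: dict) -> dict:
    """Example provider; the consumer never sees how it is implemented."""
    return {"track_id": request["track_id"], "status": "active"}


registry = ServiceRegistry()
registry.publish("track-lookup", track_lookup)  # Publish
service = registry.discover("track-lookup")     # Discover
reply = service({"track_id": 42})               # Bind/Invoke
```

The consumer depends only on the published name and the shape of the request and reply, so the provider can be replaced or relocated without changing consumer code. That independence is the loose coupling the text describes.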

c.  Distributing  Services  under  Service  Oriented  Architecture 

SOA is desired because of the time and money saved by reusing
existing software and services available through various programs. By structuring the
enterprise network so that these services are accessible to any authorized user without
having the program reside in the user’s memory or be rewritten in the user’s language,
these services can be distributed and used by anyone on any system platform and with
any computer software language. When a service is needed to perform some operation,
the architecture of the network is such that the user can discover the desired service,
which resides in some distributed location, and can call on the functionality of the
service even though the service application may not be in the same software language as
the user’s. The product that the user receives is in the presentation and display format of
the user’s workstation. The service does not need to know what program the user is
running to provide the service requested. Through the technology of building an interface
(SOA interfaces are ubiquitous) that can translate between the language of the user’s
application and the language of the service’s application or information resource of one
kind or another, the user is able to access that service (if authorized and authenticated).
Before they enter the network, requests for service or products from the service are
wrapped in the Extensible Markup Language to describe what the information is so that
when it arrives at the destination, the application can translate the data into a form it can
use. The markup is usually done at the network enterprise services layer of the network
(transport and session layers of the ISO model) where the middleware adds onto the
information packets one of the middleware protocols discussed in the Open Architecture
section above (e.g., distributed objects, distributed data services, group ordered
communication, or data-parallel protocols for data-centric handling). Messages between
nodes on a network are descriptive rather than instructive (loose coupling), and the
messages must be extensible (changeable). Figure 8 is a graphical description of the way
SOA is architected in the GIG, allowing the interconnectivity between units and services
so that all can share in a common set of services.
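
The idea of wrapping a request in descriptive markup before it enters the network can be illustrated with Python's standard XML library. This is a minimal sketch under stated assumptions: the `request` and `param` element names and the `track-lookup` service are hypothetical, and a fielded SOA would normally use a SOAP envelope and a published schema rather than this ad hoc format.

```python
import xml.etree.ElementTree as ET


def wrap_request(service: str, params: dict) -> bytes:
    """Describe a request in XML so the receiver can interpret it."""
    root = ET.Element("request", attrib={"service": service})
    for name, value in params.items():
        child = ET.SubElement(root, "param", attrib={"name": name})
        child.text = str(value)
    return ET.tostring(root, encoding="utf-8")


def unwrap_request(payload: bytes) -> tuple:
    """Recover the service name and parameters at the destination."""
    root = ET.fromstring(payload)
    params = {p.get("name"): p.text for p in root.findall("param")}
    return root.get("service"), params


payload = wrap_request("track-lookup", {"track_id": 42, "format": "summary"})
service, params = unwrap_request(payload)
```

The message says what the data is rather than how to process it, so the receiving application decides how to convert the text values into its own types. New `param` elements can be added later without breaking receivers that ignore names they do not recognize, which is the sense in which the messages are extensible.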


SOA  is  envisioned  to  provide  the  connectivity  between  different  warfare  areas 
to  promote  “Jointness”  and  information  sharing. 


Figure 8. A Conception of SOA in Defense Applications (From: Gehlot, 2009)

d. Advantages of SOA

• Software reuse: The exchange is transparent; neither the client nor the
service knows what application the other is running. In this way, the
network can be architected for software reuse.
No new software programs need to be developed for the exchange of
services except for the software in the middleware program that translates
the service/data.

•  Loose  coupling  between  client  and  service.  Coupling  in  simple  terms  is 
the  reaction  of  one  component  given  an  action  by  another.  Loose  coupling 
between  the  client  and  the  service  or  between  two  services  means  the 
actions  taken  by  one  program  may  be  felt  by  the  other  program  to  which  it 
is  loosely  coupled,  which  may  or  may  not  elicit  a  reaction.  In  tight 
coupling the reaction mimics the action. No coupling means there is no
reaction to the action from the originator.

Note:  During  the  beginnings  of  software  development,  instructions  to 
perform  certain  tasks  called  on  subroutines  to  perform  a  service.  The 
subroutine  was  part  of  the  software  program  and  was  tightly  coupled  to  the 
main  programming.  As  program  instructions  continued  to  grow  and  as  all 
services  could  not  reside  on  one  Central  Processing  Unit  (CPU),  a  local 
network  was  established  to  put  one  application  on  a  server  that  many 
clients  could  access  and  use.  This  was  followed  by  object  oriented 
programming,  where  the  services  were  called  upon  by  the  network  as 
objects  (programs  of  a  unique  type  that  could  be  used  and  delivered  as  an 
entity)  (Mahmoud,  2005).  Object-oriented  architecting  of  software 
contained  strong  links  between  service  provider  and  user,  and  a  change  in 
user  requirements  usually  meant  a  change  in  the  object’s  programming  to 
continue  to  be  used  by  the  new  client.  Under  SOA,  the  trick  is  the  design 
of  the  interface  between  service  and  user  to  loosely  couple  the  two  through 
software  programming  of  the  middleware  interface  to  the  network, 
allowing  for  changes  in  one  end  user  or  service  not  to  affect  the  other  end 
user or service. The language translation is accomplished through the use of
Extensible Markup Language (XML) and later, more sophisticated versions. Loose
coupling  allows  for  the  rearrangement  of  the  different  services  without 
affecting  the  users.  This  allows  for  flexibility  and  resiliency  in  networks 
(NSA/IAD,  2008). 

•  Testing  new  software  applications  can  be  done  on  the  application  itself, 
and  not  depend  on  the  interface  method  except  for  the  interface  on  the  new 
program’s  end.  This  lessens  the  interruption  of  normal  operations  at  other 
client sites and at the service site.

e. Challenges of Service Oriented Architecture

•  Security  across  the  architecture:  While  the  loose  coupling  of  the  network 
connections  between  service  requester  and  service  provider  gives  the 
global  architecture  resilience  in  recovery  from  intrusion,  it  also  means  that 
the  system,  much  the  same  as  the  Internet,  is  virtually  unbounded  and  the 
number  of  users  accessing  services  is  unknown.  Unnecessary  requests  for 
service  or  unauthorized  service  requests  could  go  undetected  using  up 
valuable  bandwidth  and  possibly  compromising  the  confidentiality  of 
information  without  the  networks’  owners  discovering  the  loss  until  it  is 
too  late  to  recover. 

•  Testing  in  the  SOA  environment  is  complex  because  of  the  size  of  the 
global  network  and  because  of  the  complications  of  testing  COTS 
products.  Commercial  documentation  of  software  testing  may  be 
insufficient  to  uncover  the  faults  or  hidden  programs  of  the  code  when 
applied  to  a  military  application.  In  addition,  the  loose  coupling  at  the 
network’s  interfaces  makes  it  difficult  to  discover  a  root  cause  for 
problems  that  span  an  interface.  Software,  whether  commercial  or 
proprietary  to  the  organization,  needs  to  have  a  formal  process  of  quality 
control  during  testing  to  handle  the  unique  aspects  of  software  coding,  and 
architecture  to  limit  the  existence  of  malfunctions  and  paths  for  intrusion. 
One  such  process  that  offers  quality  control  in  software  design  is 
Capability  Maturity  Model  Integration  (CMMI).  Its  use,  while  explicitly 
addressing  quality  control  issues,  puts  the  development  of  software  one 
step  ahead  in  limiting  vulnerabilities  in  the  software  code  and  architecture. 
Vulnerability  in  software  development  and  acquisition  of  COTS  is  a 
manifestation  of  its  states,  and  controlling  the  states  supports  the  control  of 
vulnerabilities  in  the  software  (Chittister  &  Haimes,  2006).  Software 
testing  is  designed  to  evaluate  the  ability  to  control  system  states,  but  the 
complexity  of  software  testing  makes  it  virtually  impossible  to  determine 
the  ability  to  control  all  the  states  of  the  system. 

•  Managing  metadata:  networked,  distributed  services  allow  interception  of 
information  in  packets  while  being  routed  without  knowledge  of  either  end 
user. 

• The global aspects of SOA on the GIG mean there are multiple
connections of multiple types, and it is difficult to manage the security
across all those connections. Because the network is unbounded, there is
limited governance from a global perspective. Governance comes from multiple
sources but does not translate easily across the loose coupling of the
individual networks’ interfaces at routers and other gateways. It is difficult
to monitor remote sites, especially if they are mobile and in a hostile
theater. The diversity of multiple physical data transport devices and
communication links (optical, wireless, satellite), while assisting in
survivability through redundancy and diversity, causes problems with state
awareness in connecting networks. Interface connections, while loose,
may have threat agents resident on the connected network that are
unknown to the service requester. In addition, configuration control may
be an insuperable task on a global scale, and configuration control loses
some of its meaning when the point of SOA is to connect divergent
applications across a transparent interface.

E. NETWORK ARCHITECTURAL VULNERABILITIES WITHIN NETWORK LAYERS

1.  ISO  Model 

As  discussed  at  the  beginning  of  this  chapter,  a  network  is  a  complex  organization 
of  physical  components  and  connecting  devices  arranged  in  a  certain  topology,  signal 
paths,  software  logic  controlling  the  signals,  protocols  which  define  how  information  is 
packaged,  logic  programs  controlling  the  routing  of  the  packets  of  information,  packet 
addressing  schemes  to  get  the  information  from  source  to  destination,  software  programs 
for  determining  who  gets  what  information,  mechanisms  to  keep  packets  from  interfering 
with  one  another,  the  data  and  functionality  contained  in  packets  sent  through  the 
network,  and  a  variety  of  other  schemes  for  making  the  network  operate  correctly  and 
perform  the  functions  desired.  The  ISO  has  decomposed  the  operation  of  networks  into 
seven  layers  in  its  OSI  model.  A  plan  to  evaluate  and  mitigate  risks  to  network 
enterprises  must  consider  each  abstraction  layer.  The  seven  layers  are  as  follows. 

1. Physical

2.  Data  Link 

3.  Network 

4.  Transport 

5.  Session 

6.  Presentation 

7.  Application 
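
A risk plan that must consider each abstraction layer can keep the layers in an explicit, ordered structure. The sketch below records the seven layers listed above together with the information assurance attributes that Table 1 of this chapter associates with each layer. The `Layer` enumeration and the two helper functions are hypothetical names introduced for illustration only.

```python
from enum import IntEnum


class Layer(IntEnum):
    PHYSICAL = 1
    DATA_LINK = 2
    NETWORK = 3
    TRANSPORT = 4
    SESSION = 5
    PRESENTATION = 6
    APPLICATION = 7


# Attributes each layer is chiefly protecting, as listed in Table 1.
ATTRIBUTES = {
    Layer.PHYSICAL: {"availability"},
    Layer.DATA_LINK: {"availability", "confidentiality"},
    Layer.NETWORK: {"availability", "integrity"},
    Layer.TRANSPORT: {"availability"},
    Layer.SESSION: {"integrity", "non-repudiation"},
    Layer.PRESENTATION: {"integrity"},
    Layer.APPLICATION: {"confidentiality", "integrity"},
}


def layers_protecting(attribute: str) -> list:
    """List layers, bottom up, whose Table 1 entry names the attribute."""
    return [layer for layer in Layer if attribute in ATTRIBUTES[layer]]


def unreviewed(reviewed: set) -> list:
    """Return the layers a risk assessment has not yet examined."""
    return [layer for layer in Layer if layer not in reviewed]


availability_layers = layers_protecting("availability")
still_to_review = unreviewed({Layer.PHYSICAL, Layer.APPLICATION})
```

Keeping the layers enumerated makes gaps visible: an assessment that has covered only the physical and application layers still has five layers to examine.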

Figure 9 is a graphical rendition created by the author of the ISO seven layer
abstraction to focus on the interconnectivity of each layer and what function each layer is
performing in the model.

[Figure 9 diagram. Recoverable callouts: creating the signal that will carry information; selecting the best path from source to destination; the information from source gets marked up to be usable at the destination; creating the paths that the signal can travel; source and destination make a connection to pass information; where information is generated or used; Physical Layer: the hardware for the electrical signal to flow through.]

An understanding of the form of a signal or packet in each network layer
abstraction supports architecting a resilient network at each layer.

Figure 9. Graphical Interpretation of Network Layer Abstractions

2. Vulnerabilities and Mitigating Strategies within the ISO Layers

Table  1  is  a  synopsis  of  an  analysis  of  vulnerabilities  in  network  architecture  as 
seen  through  the  information  abstraction  of  the  ISO’s  network  layer  definition.  Appendix 
A provides an expansion on each of the layers in the table.

International Standardization Organization (ISO) Seven Layer Reference Model

| Layer | Network abstraction | Quality of service attribute (the attribute Information Assurance is trying to protect) | Vulnerability (what could go wrong) | Mitigation strategy (what can be done about it) |
|---|---|---|---|---|
| Application | Services (e-mail, video stream, computation, collaboration) | Confidentiality, integrity | Fabrication, interception, and modification of information inside firewalls and security management | Data replication, diversity, distribution and multiple sources of like services, multiple duplicated users |
| Presentation | Formatting, encryption, data compression | Integrity | Information markup in XML corrupted to intercept data, or incoming markup corrupted to deliver malicious program | Authentication, encryption, PKI, OS protection and virus database updates, access control, distribution |
| Session | Setup and management of session | Integrity, non-repudiation | Session interrupted, or joined by unauthorized node capable of hijacking or eavesdropping on session | Authentication, access control, multicast, process and execution timing, intrusion detection, diversion, publish/subscribe schemes |
| Transport | Complete messages and e2e recovery; TCP | Availability | Intruder penetrates hole in VPN or layered network, or sets up unauthorized receipt permission at unauthorized node, or hijacks intermediary node for DDoS | Behavior and pattern recognition, adaptive router reconfiguration, deception, secure protocols, middleware/box management, secure socket layers, VPN |
| Network | Packet flow to establish connectivity between many links, provides basis for network management services; IP | Availability, integrity | Network path management reconfigured for worst case routing or interception of packets on transparent internet | Intrusion detection, layered and mesh networks, router access control and DNS server reverse lookup modification, IP repackaging (anti-spoof), distribution |
| Data Link | Packets on one link | Availability, confidentiality | Packets on open net are intercepted and modified or dropped | Packet verification and checking, redundancy |
| Physical | Hardware and bit stream; cabling | Availability | Processor or memory chips from commercial source with secret code to reroute network traffic or intercept security management procedures. Electromagnetic interference in unshielded systems at remote locations in foreign theater. | Redundancy, diversity, separation, physical access control, secure backplane |

Each layer has its own contribution to providing a certain level of quality of service to the
information flowing through the network.

Table 1. Network Vulnerability and Mitigating Strategies within Network Layers

F. SUMMARY: NETWORK ARCHITECTURAL CONSIDERATIONS IN MANAGING NETWORK RISK

A  network  can  be  characterized  by  its  physical  and  its  logical  attributes.  The 
hardware  and  software  architecture  of  a  network  is  a  key  factor  in  the  vulnerabilities 
introduced  into  the  design  of  a  network  system.  When  assessing  the  risk  to  a  network, 
knowing  the  vulnerabilities  that  come  from  the  architecture  allows  decisions  to  be  made 
when  considering  alternative  architectures  to  minimize  the  vulnerabilities  to  the  assets  of 
the  network. 

The  topological  arrangement  of  a  network  is  the  hardware  architecture,  and 
different  topologies  introduce  different  vulnerabilities.  Mesh  network  architecture  has  the 
attribute  that  it  offers  the  network  superior  resilience  from  attack,  but  these  systems  are 
costly and complex to set up. The software network management of a mesh network can
be quite convoluted, and the effectiveness of the network’s management and
security can be hard to monitor. However, if the threat of attack against the value of the assets in
this  type  of  network  is  high,  the  cost  and  the  complexity  of  design  and  installation  may  be 
worth  it. 

The  GIG  is  a  combination  of  many  architectural  topology  arrangements,  riding  on 
the  backbone  of  the  DISA  intranet  bus.  For  this  reason,  a  comprehensive  enterprise  risk 
management  program  has  to  consider  the  GIG  architecture  from  the  top  down  in 
aggregation  and  from  the  bottom  up  as  each  type  of  topology  used  in  interconnecting 
networks  affects  the  enterprise  vulnerability  picture. 

In the risk assessment of the GIG’s software architecture, the decision to use open
architecture and to take advantage of the attributes of a SOA carries with it several areas
that can introduce vulnerabilities. Chief among these is the fact that Open Architecture
and SOA are built from commercial products (COTS) whose testing before use in the
military’s network systems may be questionable, and which may contain
hidden software programs or logic that introduce unknown vulnerabilities that appear
during network operation. Testing software, especially commercial software, is problematic in that
not all paths and software states can be tested within a reasonable amount of time or at a
reasonable cost. Quality control in the design of new software and integration of legacy
and COTS software can provide some assurance that many of the hidden vulnerabilities
are uncovered and corrected before that part of the network is put into operation.

III. RISK MANAGEMENT OVERVIEW


A.  INTRODUCTION 

This chapter defines risk and the terms used in the DoD risk management process.
The process is shown to result in the benefit of achieving the organization’s objectives
when used in a meaningful program that iterates the process throughout the lifecycle of
the program or system under assessment. The next chapter focuses on customizing the
risk management process to a computer network, including local networks and a network
enterprise system. Each chapter concludes with a summary of the reasons why it is time
and money well spent to protect and secure the U.S. military’s network assets and the
information and functionality contained in them by formalizing a continuous network
enterprise risk management program throughout the network enterprise life cycle.

B. RISK CATEGORIES AND DEFINITIONS

1. Risk

“Risk  is  the  measure  of  the  probability  and  severity  of  adverse  effects” 
(Lowrance,  1976;  Chittister  &  Haimes,  2006,  p.  5).  This  classic  statement  of  risk  is 
simple but powerful. It can apply to a wide range of applications including the operation
of any system and the conduct of any program. However, in applying this definition to
any project or system, a major challenge is to develop a meaningful and valid
measurement of risk and to decide what is done with that measurement once it is
obtained. There are different types of risk, depending on which system or process is being
analyzed and at what level of the system or process the analysis is performed.

•  At  the  elementary  level  of  a  system  or  subsystem,  the  major  concerns  about  risks 
to  the  system  are  from  the  standpoint  of  the  system’s  safety  and  the  ability  to 
prevent  harm  to  people  or  property. 

• At the programmatic level, there are risks to the successful completion of a
process, whether to the program’s schedule, costs, or level of performance.
Risk and its measurement are key factors in the ultimate definition of the
program’s success.

• There are risks associated with the success of operations, including meeting the
objective, having the desired effect, or maintaining an uninterrupted flow of
accurate and reliable information that is vital to the operation’s success.

• Risk at the strategic level affects decisions on allocation of resources, campaign
plan and direction, manning levels, acquisitions to pursue, or policies to enforce.
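
One common way to turn the classic definition of risk as a measure of probability and severity into a number is to multiply the two to obtain an expected loss. The sketch below makes that assumption purely for illustration; the thesis does not prescribe this formula, and an expected value can understate rare but catastrophic events. The `AdverseEvent` class, the event names, and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdverseEvent:
    name: str
    probability: float  # chance of occurrence over the period, 0 to 1
    severity: float     # loss if it occurs, in any consistent unit

    def expected_loss(self) -> float:
        """Probability-weighted severity of this event."""
        if not 0.0 <= self.probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        return self.probability * self.severity


def rank_by_risk(events):
    """Order events from highest to lowest expected loss."""
    return sorted(events, key=lambda e: e.expected_loss(), reverse=True)


events = [
    AdverseEvent("link outage", probability=0.30, severity=10.0),
    AdverseEvent("data compromise", probability=0.05, severity=200.0),
    AdverseEvent("schedule slip", probability=0.50, severity=4.0),
]
ranking = [e.name for e in rank_by_risk(events)]
```

In this made-up data the least likely event ranks first because its severity dominates, which illustrates why probability and severity have to be considered together rather than separately.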

a.  Safety  Risk 

Looking  at  the  risks  to  the  safety  and  performance  of  systems,  the  U.S. 
Navy’s  Systems  Commands  define  risk  in  NAVSEAINST  5000.8  (DoN,  NSRMP,  2008) 
as  “Risk  is  the  potential  for  mishaps  or  other  adverse  variation  in  the  cost,  schedule  or 
performance  of  a  program  or  its  products.”  In  the  operation  of  a  system,  the  desired 
outcome is for the system to operate as it is designed. The risk comes from the likelihood
of the system not operating as designed and, if that likelihood is realized, from the
consequences of degraded or failed performance for the safety of personnel and equipment,
including people and property outside the boundaries of the system. The Navy’s
Operational Risk Management Instruction, OPNAV 3500.39B, describes a process for
discovering and dealing with safety and hazard risks at the unit level (Kujawski, 2009). In
his  explanation  of  the  correct  operation  of  safety-critical  computer  systems,  Neil  Storey 
defines  risk  as  “  ...  a  combination  of  the  frequency  or  probability  of  a  specified 
hazardous  event,  and  its  consequence”  (Storey,  1996,  p.  60).  Other  definitions,  which 
relate  to  risk  when  considering  system  safety,  are  as  follows. 

•  A  hazard  (natural)  or  threat  (human  initiated)  is  an  act  or  occurrence 
posing  a  potential  of  harm  to  a  person  or  thing. 

•  An  incident  is  the  occurrence  of  a  hazardous  event  which  has  the  potential 
to  cause  harm  under  different  circumstances. 

•  An  accident  is  the  unintended  occurrence  of  an  event  or  sequence  of 
events  that  causes  some  measurable  degradation  or  complete  failure  of  a 
system  or  harm  to  people  (Storey,  1996). 

b.  Financial  and  Program  Risk 

The Risk Management Guide for DoD Acquisition, 6th Edition (DoD
RMG, 2006) and the Naval Systems Commands Instruction NAVSEAINST 5000.8
(DoN, NSRMP, 2008) both look at risks to a program (financial, schedule, and personnel)
that would inhibit the program from meeting key objectives in those areas. Program
managers are directed in those instructions to determine what might threaten the
program’s schedule, budget, or personnel from performing as required to meet objectives,
and to weigh the probability of those inhibitors against the consequences to the program
should they occur. Armed with that knowledge, a program manager can then develop a
plan to decrease the probability of the unwanted events causing consequences
unacceptable to the program. The guide defines risk by the following statement: “Risk is
a measure of future uncertainties in achieving program performance goals and objectives
within defined cost, schedule and performance constraints.” Program risks have three
components.

•  A  root  cause  (yet  to  manifest  itself),  which,  if  eliminated  or  corrected, 
would  prevent  a  potential  consequence  from  occurring 

•  A  probability  (or  likelihood)  assessed  at  the  present  time  of  that  future  root 
cause  occurring 

•  The  consequence  (or  effect)  of  that  future  occurrence. 

A  root  cause  is  the  most  basic  reason  for  the  presence  of  a  risk. 
Accordingly,  risks  should  be  tied  to  root  causes  and  their  effects  (DoD  RMG,  2006,  p.  1). 
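
The three components of a program risk, and the advice to tie risks to root causes, suggest a simple record structure for a risk register. The sketch below is an assumption about how such a register might be organized, not a format taken from the DoD guide. The `ProgramRisk` class, the helper functions, and the sample entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProgramRisk:
    root_cause: str     # the most basic reason the risk exists
    probability: float  # assessed now, of the root cause occurring
    consequence: str    # effect on cost, schedule, or performance


def by_root_cause(risks):
    """Group risks so each root cause shows every consequence tied to it."""
    grouped = {}
    for risk in risks:
        grouped.setdefault(risk.root_cause, []).append(risk)
    return grouped


def after_eliminating(risks, root_cause):
    """Risks that remain once a root cause has been removed or corrected."""
    return [r for r in risks if r.root_cause != root_cause]


register = [
    ProgramRisk("untested COTS router firmware", 0.4, "schedule slip in integration"),
    ProgramRisk("untested COTS router firmware", 0.4, "degraded network performance"),
    ProgramRisk("single-source supplier", 0.2, "cost growth"),
]
grouped = by_root_cause(register)
remaining = after_eliminating(register, "untested COTS router firmware")
```

Grouping by root cause shows that correcting one cause can remove several potential consequences at once, which is the practical reason for tying each risk to its root cause.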

c.  Operational  Risk 

Looking at risk from an operational level, Bilal Ayyub (2003, p. 35) says
that risk can be framed in the context of a scenario or event as the occurrence likelihood
and occurrence consequences of an event. It is also the potential for loss or reward
resulting from exposure to a hazard that, if realized, would result in an outcome of some
measurable significance on a defined population of people and machines. Risk is
measured by defining the components of the risk, measuring the chance or probability,
and measuring the potential negative or positive rewards or benefits (Ayyub, 2003).
Ayyub is talking about both opportunity and adverse risk.

In  the  military  setting,  operational  risk  is  encountered  on  a  constant  basis 
from  the  theater  commander  to  the  unit  commander;  only  the  level  of  the  risk  to 
operations  differs  at  the  echelon  of  command  to  which  it  applies.  At  the  Combatant 
Commander  level,  risk  of  mission  success  or  failure  is  considered  when  determining 
courses of action. In the doctrine of Effects Based Approach to Operations (EBAO)
[Commander’s Handbook for an Effects-Based Approach to Joint Operations], the
mission objective is to achieve an effect, or a change of system state of the adversary to
what the Combatant Commander desires. Courses of Action (COA) are considered to
achieve the effect desired, and each COA carries a probability that the action does not
achieve the effect desired: an operational risk resulting in a consequence. An additional
risk  to  operations  from  the  EBAO  approach  is  the  risk  that  actions  result  in  unintended 
consequences.  “One  will  always  encounter  unintended  effects,  both  good  and  bad,  and 
those  that  extend  beyond  objective  accomplishment.  Improving  awareness  can  help 
anticipate  many  outcomes  and  mitigate  the  impact  of  unintended  negative  effects” 
(Hunerwadel,  2006,  p.  1).  Conversely,  a  military  adversary  is  also  trying  to  achieve  an 
effect  on  U.S.  forces.  The  operational  risk  comes  from  the  threat  of  enemy  action,  and  the 
consequences  if  the  threat  should  become  reality.  The  military  commander  must  consider 
what motivates the enemy to act, what the action is, what can be done to decrease or
mitigate his forces’ vulnerability, and what the consequences are if the threatened action
happens. This type of analysis is directly applicable to the way the risks to a computer
network should be handled, as is shown in the subsequent chapters.

d.  Enterprise  Risk 

Because an enterprise is an integration of several systems (a system of systems or
family of systems) that interoperate to some extent to achieve a common objective and
produce the desired effects, risk to the enterprise takes on a holistic perspective
(Haimes, 2007). Risks, or the likelihood of occurrences that would hinder desired
outcomes that achieve the ultimate strategic goals of the enterprise, can come from
external or internal sources and can be directed at multiple objectives such as finance,
people, processes, and operational events. Dealing with enterprise risk requires common
enterprise understanding, strategic communication planning, cross-enterprise alignment,
and sound understanding of the evolving environment (Kujawski, 2009).

2. Risk Analysis


Risk analysis comprises the processes and practices used to identify and assess risk. It is
designed to answer three questions: what could go wrong, how likely is it to go wrong,
and what would the consequences be if it did (Haimes, 2007; Blanchard &
Fabrycky, 2006).


a.  Risk  Identification 

The  identification  of  risk  is  the  process  of  ascertaining  what  could  go 
wrong.  Sources  of  risk  can  come  from  natural  or  man-made  hazards,  from  unintentional 
incidents  or  accidents,  from  unreliable  hardware  components  or  software  programs,  from 
software  that  has  been  unintentionally  or  intentionally  designed  with  flaws  or  faults,  and 
from  individuals  or  organizations  (cultural  or  political)  who  desire  to  tamper  with  and 
disturb or destroy system effectiveness. Identifying the risks posed by parties intent on
harming a system requires matching the threat from these parties with the set of vulnerable
system states that the threat can exploit. In particular:

•  Threats  and  threat  agents  are  entities  with  the  motivation  and  the 
capability  to  cause  system  disruption,  harm  or  failure.  A  threat  is  an  event 
that  has  not  happened  but  has  a  chance  of  happening;  thus,  a  probability  of 
occurrence  is  associated  with  the  threat. 

•  Vulnerability is the degree of exposure and the number of weaknesses in the
system that a threat could exploit. Yacov Haimes and Clyde Chittister (2006)
further define vulnerability in software engineering as “...the
manifestation of inherent states of a system that can be exploited or
otherwise adversely affected...” The authors also say that the ability to
control system states implies an ability to control vulnerability. To exploit
a vulnerability, the threat must first discover it and then determine the
optimum time to exploit it to achieve the desired effect.

b.  Risk  Assessment 

Risk assessment is a combination of risk identification, likelihood, and the
associated consequences (Haimes, 2007). Assessment of risk involves the
application of processes and methodologies, often through the use of process models,
mathematical formulations, or simulation, to quantify risk elements and to prioritize
when and how to deal with them. Quantification of the impact relies on assigning
probabilities or values to the effect on the system of a threat’s capability,
intention, and progress (again a temporal attribute) toward completing the intended
threat objective. The combination of the likelihood that a threat exploits
a vulnerable system element and the impact that could result quantifies the risk to a
system for a given scenario. A prioritized list can thereby be generated. Risk
is a difficult concept to grasp, let alone to quantify as a metric.
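
The quantification and prioritization described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming that each scenario carries a single probability of exploitation and a single impact value on a common scale. The class name, scenario names, and numbers are hypothetical and do not come from any of the cited sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskScenario:
    """One threat/vulnerability pairing (all values are hypothetical)."""
    name: str
    probability: float  # likelihood that the threat exploits the vulnerability, 0..1
    impact: float       # consequence if it happens, in arbitrary cost units

    @property
    def risk(self) -> float:
        # Risk for a given scenario = likelihood of exploitation x impact.
        return self.probability * self.impact


def prioritize(scenarios):
    """Return the scenarios ordered from highest to lowest quantified risk."""
    return sorted(scenarios, key=lambda s: s.risk, reverse=True)


scenarios = [
    RiskScenario("router hardware failure", probability=0.10, impact=150.0),
    RiskScenario("insider data theft", probability=0.05, impact=1000.0),
    RiskScenario("unpatched web server", probability=0.30, impact=200.0),
]

for s in prioritize(scenarios):
    print(f"{s.name}: {s.risk:.1f}")
```

A single product of probability and impact hides the difference between frequent minor events and rare severe ones, which is one reason the resulting list is best treated as a starting point for judgment rather than a final ranking.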

3. Risk Management

Risk management involves the determination of what should be done about the
risks identified. Risk management is the process of making management decisions,
implementing the decisions (taking action) based on risk assessment, controlling the
identified risks, and tracking the results of actions taken. Follow-up is equally important
and involves taking further action based on the effectiveness of the initial actions, and
continuing to monitor the environment for changes that would alter the course of action
as the level of risk changes. Once it is determined what could go wrong, how likely it is,
and what the impact would be, it is then incumbent on the
analyst to discern what can be done about it, what trade-offs can be made to decrease
risk, and what effect the decisions to take actions to improve the measure of risk have on
the future operation of the system (Haimes, 2007). Depending upon the degree of risk and
the organization’s tolerance of risk, coupled with the importance of accomplishing a
given objective, the organization’s action on the risk assessment yields the strategy to
deal with risk elements by avoiding, accepting, transferring, or mitigating the risk. The
subcategories of managing risk, defined below, are the steps in the process for handling
identified risks. These categories are mostly aligned with the DoD’s definitions and
processes for risk management, but some go into greater detail than the DoD Risk
Management Guide (DoD RMG, 2006).

a.  Risk  Mitigation  Planning 

Planning for risk mitigation is the activity of examining courses of action
that decrease or eliminate a threat posing a risk; patch or eliminate a system or
program vulnerability that a threat could exploit; or change the importance of accomplishing
desired objectives to decrease the impact or change the consequences if a risk becomes a
reality. Mitigation can be accomplished through technological means or through processes and
procedures (e.g., system operating rules, personnel selection and training) (DoD RMG,
2006).


b.  Risk  Mitigation  Implementation  and  Plan  of  Action 

Risk  management  does  not  end  with  the  collection  of  ideas  of  what  can  be 
done  with  the  risks  identified  and  assessed.  The  decision  makers  must  decide  and  act  on 
the  priorities  established  and  implement  their  actions.  Part  of  the  decision-making  process 
must  be  an  evaluation  of  what  effect  the  decisions  have  on  the  future  operation  of  the 
system  and  what  new  exposures  might  be  created  because  of  their  actions  (DoD  RMG, 
2006). 


c.  Risk  Management  Plan  Tracking 

The effects of the implemented actions must be monitored to see if the
goal was achieved, or if modifications need to be made to the implementation plan. In
addition, any changes to system or program performance must be monitored as the
environment surrounding the system changes. While monitoring the system response to
the implemented plan, analysts should determine whether the response, positive or negative,
results from actually mitigating the identified risk or from other changes (DoD RMG, 2006).

C.  DECISION-MAKING  PROCESS 
1. Risk Factors/Influence
a.  Influence 

Understanding how risk affects the operation of an enterprise, which
encompasses both probability and consequences, is fundamental to making good
decisions in managing the risk. Part of that understanding comes from knowing what
influences the decisions made and what influence those decisions have on subsequent
outcomes. In turn, this motivates subsequent actions taken in response to the results of the
first decisions. The experts in the field of risk analysis (Clemen & Reilly, 2001; Haimes,
2009) advocate using influence diagrams to aid in the visualization of the consequences
of making risky decisions. Influence diagrams graphically display the decisions, chance
events  and  scenarios,  and  outcomes  with  arcs  representing  the  direction  of  influence  and 
sequence.  An  influence  diagram  helps  the  decision  maker  put  decisions  and  alternatives 
in  context  and  to  visualize  the  factors  that  affect  the  desired  outcome  or  effect.  Influence 
diagrams  can  be  used  for  both  opportunity  and  adverse  risk  scenarios.  When  used  to 
display  decisions  for  risk  management,  the  diagrams  are  mainly  utilized  to  show  the 
relationship  between  actions  and  outcomes  as  a  result  of  adverse  chance  situations  (risk) 
(Clemen & Reilly, 2001; Blanchard & Fabrycky, 2006). Figure 10 is an example of a
simple,  one-objective  influence  diagram  on  the  decision  whether  to  install  a  firewall  in 
the  network  or  not.  Each  action  combined  with  an  element  of  chance  results  in  an 
outcome. 


[Figure 10 diagram. Decision node: “Install firewall upstream of network gateway.” Figure note: Decisions produce outcomes, and knowing the influences on the outcomes aids in making better decisions.]

Figure 10. Basic Influence Diagram on Risk to a Network

In the diagram, a rectangle represents a decision, a rounded rectangle
represents an objective, an oval is a chance event, and a diamond is the final outcome,
final consequence, or “overall satisfaction.” The arrows are arcs representing either a
sequence of events or the relevance of an event. In the figure above, both arrows are
sequential; a decision is made and a chance event may or may not happen. The outcome,
whether database confidentiality is maintained or not, is determined by a
decision/action and by the result of a chance event. The idea of actions and chance events
coupled to yield an outcome is explored further in the next discussion. However, in the
influence diagram, measures of probability of the chance event and the decision variables
are not graphically shown. The idea of the diagram is to show the relationship between
events of chance and actions. In a slightly more complex situation, an influence diagram
could be used to show multiple chance events impacting intermediate consequences,
which arise out of the desire to meet two or more objectives. Figure 11 illustrates how the
decision whether to install an upgraded firewall upstream of the network’s gateway to the
Internet is influenced by two chance events: the intruder, again, and the decision maker’s
budget.

[Figure 11 diagram. Figure note: Knowing the influences causing a mind-set for making a certain decision supports better decision making.]

Figure 11. Multiple Objective Influence Diagram on Risk to a Network

Two intermediate objectives precede the outcome of whether database
confidentiality would be compromised: stronger firewall protection and, because the
firewall is on the Internet side, no requirement for encryption within the network’s LAN
architecture, which saves on the budget and increases data accessibility for authorized users.
The  figure  shows  how  the  chance  event  of  the  budget  amount  has  relevance  to  the 
decision/action  to  install  the  upgraded  firewall  and  to  the  intermediary  consequences  of 
stronger  firewall  protection  and  “no  encryption  inside  gateway”.  The  chance  event  of  an 
intruder  gaining  access  to  the  LAN  has  relevance  to  both  intermediary  objectives  also. 
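
The relationships described for Figure 11 can also be written down as a small directed graph, which makes it possible to ask which factors bear on a given outcome. The following Python sketch is an illustration only. The node labels paraphrase the text, and the arcs from the decision to the two intermediate objectives are an assumption made for the sketch, since the text describes only the arcs from the chance events explicitly.

```python
# Node kinds follow the shapes described in the text:
# decision (rectangle), chance (oval), objective (rounded rectangle),
# outcome (diamond).
nodes = {
    "install upgraded firewall": "decision",
    "budget": "chance",
    "intruder gains access to LAN": "chance",
    "stronger firewall protection": "objective",
    "no encryption inside gateway": "objective",
    "database confidentiality": "outcome",
}

# Arcs as (influencer, influenced) pairs.
arcs = [
    ("budget", "install upgraded firewall"),
    ("budget", "stronger firewall protection"),
    ("budget", "no encryption inside gateway"),
    ("intruder gains access to LAN", "stronger firewall protection"),
    ("intruder gains access to LAN", "no encryption inside gateway"),
    # Assumed for this sketch: the decision feeds both objectives.
    ("install upgraded firewall", "stronger firewall protection"),
    ("install upgraded firewall", "no encryption inside gateway"),
    ("stronger firewall protection", "database confidentiality"),
    ("no encryption inside gateway", "database confidentiality"),
]


def influences_on(target):
    """Return every node that influences `target`, directly or indirectly."""
    found = set()
    frontier = [target]
    while frontier:
        current = frontier.pop()
        for source, dest in arcs:
            if dest == current and source not in found:
                found.add(source)
                frontier.append(source)
    return found


print(sorted(influences_on("database confidentiality")))
```

As in the diagram itself, no probabilities are attached to the arcs; the structure shows only which events and actions are relevant to which consequences.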

The diagrams above are grossly simplified. It would be advantageous to
carry the analysis one step further to diagram how the outcome of loss of confidentiality
to the database would influence the desired effect of the network enterprise, e.g., the loss
of confidentiality on this network would give an adversary access to vital campaign plans
and operations plans, which would have to be redone to gain the advantage. A thought to
consider: the loss of the campaign plans, although highly unlikely because this particular
network is isolated/disconnected from the Internet, would be catastrophic to an already
deployed Army Division that is to follow those plans. This fact might increase the weight
given to the influence of the intruder’s attack and decrease the influence of the budget,
causing a reallocation of dollars to this network and away from another.

b.  Uncertainty 

In his book Risk Modeling, Assessment, and Management (3rd ed.),
Yacov Haimes (2009, p. 158) explains how most decisions that involve a chance
(probability) of the events or scenarios happening are based on maximizing the expected
value of the outcomes’ “payoffs.” However, in risk management, basing decisions on the
expected value, or mean, of a set of outcomes resulting from action taken in the face of the
probability of a set of scenarios is not necessarily wise. In the section about the
fallacy of expected value, Haimes makes the point that if decisions were based on the
expected value of the outcome, systems would be constructed ignoring the possibility of
low-probability but highly catastrophic outcomes. His formulation for the way
decisions should be made is to partition the probability distribution into segments and
calculate the expected value in each segment. Even though the catastrophic events have a
very low probability of occurrence, human behavior and preference tend to skew the
integrated result toward actions that protect the system against the catastrophic event
more than if only the expected value of the total distribution were used. This method,
called the Partitioned Multi-objective Risk Method, more realistically “conditions” the
expectations of the decision maker, and allows for a practical decision conclusion in the
face of risks to a system, especially considering the element of safety risk (Haimes,
2009).
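
The effect of partitioning can be seen with a small numerical sketch. The Python code below is a simplified illustration of conditional expected values over two loss ranges, not Haimes's full formulation. The loss distribution, the partition boundary of 1,000 cost units, and the function names are all hypothetical.

```python
# Hypothetical discrete loss distribution: (loss in cost units, probability).
outcomes = [
    (0.0, 0.70),
    (10.0, 0.20),
    (100.0, 0.09),
    (10000.0, 0.01),  # low-probability, catastrophic outcome
]


def expected_value(dist):
    """Unconditional expected loss over the whole distribution."""
    return sum(loss * p for loss, p in dist)


def conditional_expected_value(dist, low, high):
    """Expected loss given that the loss falls in the partition [low, high)."""
    segment = [(loss, p) for loss, p in dist if low <= loss < high]
    mass = sum(p for _, p in segment)
    if mass == 0:
        raise ValueError("partition has zero probability")
    return sum(loss * p for loss, p in segment) / mass


overall = expected_value(outcomes)
routine = conditional_expected_value(outcomes, 0.0, 1000.0)
extreme = conditional_expected_value(outcomes, 1000.0, float("inf"))
print(overall, routine, extreme)
```

With these illustrative numbers, the overall expected loss is far below the loss in the catastrophic partition, so a decision based on the overall figure alone would understate what is at stake if the rare event occurs.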

As is the case in most real-world systems, knowledge of the probability of chance events
is incomplete, but some data is available and can be used to develop a probability
distribution of the chance events. Two methods can be used to estimate the distribution.
The first is the fractile method, which dissects the data into fractiles, usually of 25%,
50%, 75%, and 100%, and relates the outcomes to a fractile. Probability density and
cumulative distribution
functions can then be developed to give a probability to the chance events. The second
method is to obtain expert advice on the lowest, most likely, and highest value of
outcome. From this assessment, a triangular probability density function is constructed
with the base spanning the lowest and highest outcome values and the apex at the most
likely value. From the
probability density and resultant cumulative distribution functions, probabilities of the set
of outcomes are available to analyze against a decision to be made in light of the outcome
(Haimes, 2009; Blanchard & Fabrycky, 2006).
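
The second method lends itself to a short worked example. The Python sketch below evaluates the cumulative distribution function of a triangular distribution built from three expert estimates. The function name and the example values (lowest 10, most likely 20, highest 50, in arbitrary units) are assumptions chosen for illustration.

```python
def triangular_cdf(x, low, mode, high):
    """Cumulative probability P(X <= x) for a triangular distribution.

    low, mode, and high are the expert's lowest, most likely, and
    highest outcome values.
    """
    if not low <= mode <= high or low == high:
        raise ValueError("require low <= mode <= high and low < high")
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    if x <= mode:
        # Rising side of the triangle (only reached when mode > low).
        return (x - low) ** 2 / ((high - low) * (mode - low))
    # Falling side of the triangle.
    return 1.0 - (high - x) ** 2 / ((high - low) * (high - mode))


# Probability that the outcome does not exceed 30 units.
print(triangular_cdf(30.0, low=10.0, mode=20.0, high=50.0))
```

Differences of this function between two outcome values give the probability that the outcome falls in that interval, which is the quantity needed when weighing a decision against a range of outcomes.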

To  frame  what  is  actually  accomplished  in  a  risk  management  process,  the 
steps  used  by  the  DoD  Risk  Management  Guide  for  Acquisition  are  used  to  illustrate  how 
the  process  works  in  an  actual  very  large  organization  for  the  management  of 
programmatic  risk. 

D. THE DOD RISK MANAGEMENT PROCESS

The  sixth  edition  (2006)  of  the  DoD  Risk  Management  Guide  for  Acquisition 
describes  a  generic  process  model  for  managing  risk  in  acquisition  programs.  It  is  the 
DoD  guide  and  template  for  other  risk  management  schemes  for  processes  other  than 
acquisition  and  for  the  individual  service  components’  plans  for  risk  management  within 
their service-related programs. The steps of the risk management guide closely follow the
steps described above, including risk identification, risk analysis, risk mitigation planning,
implementation, and tracking, and are shown graphically in Figure 12. It also provides a
planning  guide  for  setting  up  a  system  of  risk  management  in  an  acquisition  program. 
The  guide  states  that  “DoD  risk  management  is  based  on  the  principles  that  risk 
management  must  be  forward-looking,  structured,  continuous,  and  informative.  The  key 
to  successful  risk  management  is  early  planning,  resourcing,  and  aggressive  execution” 
(DoD  RMG,  2006,  p.  22).  The  guide  exhorts  Program  Managers  to  evaluate  their 
programs in light of the risk to meeting cost, schedules and requirements. In fact, the
definition of risk in this guide is: “Risk is a measure of future uncertainties in achieving
program performance goals and objectives within defined cost, schedule and performance
constraints” (DoD RMG, 2006, p. 1).

[Figure 12 diagram. Figure note: The basic process of managing risk is essentially the same for safety, finances, or operations. How it is applied depends in large part on the application.]

Figure 12. DoD Risk Management Process (From: DoD RMG, 2006)

1. DoD Risk Identification

The  first  step  in  DoD’s  risk  management  process  is  risk  identification.  Risk 
identification  is  the  activity  that  examines  each  element  of  the  program  to  identify 
associated  root  causes,  begin  their  documentation,  and  set  the  stage  for  their  successful 
management.  Risk  identification  begins  as  early  as  possible  in  successful  programs  and 
continues  throughout  the  program  with  regular  reviews  and  analyses  of  Technical 
Performance  Measurements  (TPMs),  schedule,  resource  data,  life-cycle  cost 
information,... (DoD RMG, 2006, p. 7).

2. DoD Qualitative Risk Analysis

The  next  step  in  the  DoD  guide  is  risk  analysis,  a  step  performed  much  like  the 
risk  management  models  studied  in  the  next  chapter,  where  each  risk  element  identified 
in  step  one  is  analyzed  to  assign  it  a  likelihood  of  occurrence  and  an  impact  on  cost, 
schedule,  and/or  performance.  Figure  13  shows  that  the  analysis  is  conducted 
qualitatively  and  the  level  of  risk  is  reported  on  a  Risk  Reporting  Matrix. 


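[Figure 13 diagram: risk reporting matrix. Axes: likelihood (increasing) and undesirable consequence (increasing). Figure note: Qualitative risk assessment places the decision making in the right context; is it (the risk) something to be concerned about or not.]

Figure 13. DoD Risk Reporting Matrix (From: DoD RMG, 2006)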

The  numbers  for  likelihood  are  matched  to  a  table  of  probability  of  occurrence 
and  enumeration  of  consequence  is  matched  to  a  table  that  assigns  the  numbers  1  through 
5  to  schedule  slip,  cost  overrun,  or  technical  performance  parameters.  Interestingly,  the 
technical  performance  parameters  start  with  just  meeting  requirements  and  become 
progressively  poorer.  Also,  included  in  performance  risk  is  the  program’s  management. 
While  this  is  an  important  risk  factor  for  a  process  such  as  acquisition,  it  is  sometimes 
overlooked  when  evaluating  risks  to  a  system. 
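
A reporting matrix of this kind amounts to a lookup from a pair of ratings to a color band. The Python sketch below shows the idea only. The text states that consequence is rated 1 through 5; the same 1 through 5 scale for likelihood, the use of a product score, and the two thresholds are illustrative assumptions and do not reproduce the cell assignments in the DoD guide.

```python
def risk_level(likelihood, consequence):
    """Map likelihood and consequence ratings (1-5 each) to a color band.

    The score and thresholds are illustrative assumptions, not the
    cell-by-cell assignments of the DoD Risk Reporting Matrix.
    """
    for rating in (likelihood, consequence):
        if rating not in range(1, 6):
            raise ValueError("ratings must be integers from 1 to 5")
    score = likelihood * consequence
    if score >= 15:
        return "red"
    if score >= 6:
        return "yellow"
    return "green"


print(risk_level(2, 3))
```

In practice the band for each cell would be read from the published matrix rather than computed, because a matrix need not be symmetric in likelihood and consequence.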


3. DoD Risk Mitigation Planning

Risk  mitigation  planning  is  the  process  of  evaluating  the  results  of  the  risk 
analysis  and  deciding  on  which  risks  need  to  be  mitigated,  how  and  when  mitigation 
should  be  accomplished,  and  who  is  responsible.  Risks  that  fall  into  the  red  blocks  of  the 
risk  reporting  matrix  (in  Figure  13)  naturally  have  priority  over  risks  that  fall  in  the 
yellow  region.  Those  risks  in  the  green  region  most  likely  are  considered  to  be  acceptable 
risks, and no mitigation efforts are required up front. However, even though these risks
are acceptable now, they still need to be tracked to ensure that their likelihood or their impact
does not change over the course of the program or because of outside influences.
Once  the  risks  have  been  prioritized,  planning  turns  to  ascertaining  how  to  mitigate  the 
most  risky.  The  means  of  mitigation  must  be  balanced  against  the  costs  of  mitigation,  not 
only  the  cost  during  acquisition  and  development,  but  also  what  impact  a  mitigation 
action  has  over  the  life  cycle  costs  of  the  program  or  system.  Mitigation  actions  might 
have  implications  on  the  technology  required  to  mitigate  the  risk,  and  of  course,  the 
mitigation  strategy  has  to  examine  and  balance  the  opportunity  cost  that  a  mitigation 
strategy  may  have  on  system  functionality  (DoD  RMG,  2006). 

4.  DoD  Risk  Mitigation  Plan  Implementation 

Implementation  is  the  process  of  putting  the  plan  developed  above  into  action. 
Implementation  is  the  management  function  of  communicating  the  plan  to  both  action 
personnel  and  to  the  selected  stakeholders  who  have  a  vested  interest  in  the  resulting  new 
system  requirements  if  the  plan  is  implemented.  Implementation  is  also  the  management 
function  of  assigning  mitigation  action  responsibilities,  and  inspecting  the  progress  and 
results  of  implementation.  Finally,  the  implementation  requires  some  type  of  reporting 
activity  to  keep  program  management  aware  of  changes  to  a  program  and  the  effects 
those  changes  have  on  the  program’s  cost  or  schedule.  Justification  for  program  cost 
increases  or  schedule  changes  can  best  be  documented  by  relating  them  to  the  risks  being 
mitigated and the consequence of not spending the time or money to mitigate an
identified risk. It also documents any changes to the performance parameters of the
program’s ultimate product because of changes to the system to mitigate a risk (DoD
RMG, 2006).


5.  Risk  Tracking 

As  with  any  good  management  plan,  after  action  is  decided  upon  and  taken,  it  is 
good  practice  to  monitor  the  program  or  the  system  to  see  the  result  of  the  mitigation 
efforts.  If  the  results  are  not  what  were  expected,  or  conditions  in  the  environment 
affecting  program  performance  change,  further  actions  or  a  change  to  current  actions  may 
be in order to keep the level of risk low, or to meet program objectives. The
benefit of mitigating an identified risk must be weighed against the total cost of
implementation, and proper tracking of the results of mitigation actions supports or refutes
the decisions made so that any decisions that are not working can be changed. As the graphic of
the  process  clearly  indicates,  risk  management  is  not  a  once-through  process.  Tracking 
inevitably  leads  to  the  identification  of  other  risks  not  uncovered  in  the  first  iteration  of 
the  process.  Based  on  good  systems  engineering  principles,  the  process  is  repeatable,  in 
the  case  of  this  directive,  throughout  the  acquisition  cycle  (DoD  RMG,  2006). 

E.  SUMMARY 

1.  Benefits  of  the  Risk  Management  Process 

The  rigorous  application  of  a  risk  management  process  is  an  important  weapon  in 
the  program  manager’s  arsenal  supporting  activities  and  decisions  leading  to  a  successful 
program  and  a  capable  product.  For  instance,  had  the  Future  Combat  Systems  (FCS) 
program  been  able  to  quantify  the  risk  that  the  integration  of  new  networking 
technologies  across  such  a  wide  array  of  hardware  systems  would  have  posed,  the 
program  directors  might  have  been  able  to  mitigate  that  risk  by  partitioning  the  disparate 
systems  networks  into  manageable  pieces.  The  consequences  of  that  risk  unmitigated  are 
apparent  in  the  fragmented  status  of  the  program  today.  As  is  evident  in  the  FCS 
program,  one  of  a  program  manager’s  important  risk  factors  is  the  political  environment 
surrounding a program, and equally important is the necessity to examine the level of risk
contributed by that factor continually as the political environment changes.

2. Relating the Process to the Network Enterprise

The  general  methodology  described  above  translates  well  to  the  analysis  of  risk 
and  the  implementation  of  mitigation  strategies  for  information  systems  and  computer 
networks.  The  risk  process  described  above  is  related  to  programmatic  risk  in  the  Defense 
acquisition  community,  but  as  is  shown  in  the  next  chapter,  the  basic  steps  for  applying 
the  process  to  programmatic  risk  are  equally  applicable  to  operational  and  enterprise  risk 
encountered  in  local  and  enterprise  networked  systems.  The  basics  of  risk  determination, 
probability  of  an  event  times  the  impact  of  that  event,  still  hold  true  for  the  assessment  of 
risk  to  a  network. 

3. Relating Benefits to Costs

To  be  able  to  mitigate  risk  to  the  operations  of  a  local  network  or  an  enterprise 
system  of  systems  and  achieve  operational  or  strategic  goals,  identified  risks  are  assessed 
to  make  the  decision  whether  it  is  worth  the  cost  in  funding  or  opportunity  to  plan  and 
implement  a  mitigation  strategy  for  that  risk.  The  answer  to  the  question  “What  is  this 
mitigation  strategy  protecting?”  directly  affects  the  mitigation  strategy  employed.  For 
risks  of  little  or  no  impact,  no  matter  the  likelihood  of  occurrence,  the  mitigation  strategy 
may be one of accepting the risk as is. Implementation of security requirements that
restrict the functionality of a network incurs monetary as well as opportunity costs, life cycle
costs, and some hidden or latent costs (such as stakeholder costs in the future). If the
protected  system  and  its  information  have  little  impact  on  the  success  of  achieving  the 
desired  effect,  it  might  be  prudent  to  reallocate  that  funding  and  technical  solution 
elsewhere. 
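
One simple way to frame the comparison of benefit and cost is to set the expected loss avoided by a mitigation against its total cost. The Python sketch below assumes that risk is expressed as probability times impact in the same units as the mitigation cost. The function name and all numbers are hypothetical, and opportunity and latent costs are taken to be already included in the cost figure.

```python
def mitigation_net_benefit(prob_before, prob_after, impact, mitigation_cost):
    """Expected loss avoided by a mitigation minus its total cost."""
    risk_before = prob_before * impact
    risk_after = prob_after * impact
    return (risk_before - risk_after) - mitigation_cost


# Same mitigation applied to a high-impact and a low-impact asset.
high_value = mitigation_net_benefit(0.20, 0.05, impact=1000.0, mitigation_cost=50.0)
low_value = mitigation_net_benefit(0.20, 0.05, impact=100.0, mitigation_cost=50.0)
print(high_value, low_value)
```

A negative result for the low-impact asset corresponds to the case described above, in which the funding and technical solution might be better allocated elsewhere.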

IV. ENTERPRISE NETWORK RISK MANAGEMENT


A.  INTRODUCTION 

Risk and the process of risk management as they apply to computer information
systems and network enterprises are examined in detail in this section. While the basic
definition  of  risk  applies  to  networked  systems  and  the  management  process  framework 
is  similar,  information  systems  have  unique  attributes  requiring  a  slightly  different 
perspective  on  how  to  determine  the  level  of  risk  to  those  systems.  By  connecting 
information  systems  into  a  network  capable  of  sharing  and  collaboration,  the  number  of 
possible  failure  paths  and  initiation  points  for  intrusion  to  cause  harm  increases  rapidly. 

An  analysis  of  the  reliability  of  a  network  is  important  to  the  designers  and 
architects  of  the  system  and  is  worthy  of  further  study.  To  be  thorough,  the  assessment  of 
risk  must  integrate  the  risks  and  hazards  associated  with  unreliability  of  equipment, 
accidental  failures  caused  by  equipment  or  people,  and  sabotage  or  attacks  to  obtain  a 
true  picture  of  the  system’s  capability  to  support  operational  and  strategic  objectives.  As 
reliability  and  human  factors  engineering  generate  key  design  criteria  for  a  system 
designer,  and  since  the  contention  is  that  risk  covers  all  the  categories  above,  the  risk 
management  process  should  include  all  those  areas  above  and  should  be  integrated  with 
the  determination  of  key  performance  parameters  at  the  beginning  of  a  system’s  life 
cycle,  during  concept  definition  and  possibly  before. 

Information systems and network enterprises are at risk due to the following:

•  Failure of hardware components, software faults and bugs, and network
communication and signal failures

•  Incorrect design, installation, or arrangement of the interconnections between
nodes on a network

•  Human  error  by  accident  or  faulty  procedures 

•  Intentional invasion and destruction of hardware, software program logic
and functionality, and network connectivity by entities intent on inhibiting
correct network operation or looking to steal or corrupt information

This thesis concentrates on examining risks to a network from the fourth source of
risks listed above, intentional intrusion into a network with the purpose of causing harm.
Intentional acts designed to disable a network enterprise are challenging to
recognize, understand fully, and counter. These challenges continue to grow as technology
improves  and  gives  the  adversary  more  tools  with  which  to  intrude  on  network 
operations.  However,  the  challenge  must  be  confronted  because  the  threat  of  network 
intrusions  and  disruptions  continues  to  increase  as  the  military  relies  more  on  networked 
systems to provide the edge in operations and strategy. The allocation of scarce monetary
and human resources without a management plan is misdirected and may impede the
purpose of the network in the first place. Installing technical network countermeasures or
writing and enforcing procedures that affect the information flows and availability on the
network supporting operations and strategy has no clear benefit without robust
risk management planning and implementation. Equally disastrous is when the risks are
ignored  or  overlooked,  allowing  the  adversary  freedom  to  intrude  and  disrupt  vital 
network  functionality  just  at  the  time  when  it  is  most  needed. 

B. RISKS IN THE NETWORK ENVIRONMENT

The level of risk to computers and information systems from intentional
intrusion can be defined in general terms as a function of the level of threat,
the vulnerability of the information system and network, and the value of the
system and information assets (Jones & Ashenden, 2005). Put another way, risk
is the result of a threat with adverse effects on a vulnerable system
(Chittister & Haimes, 2006, p. 5). When no vulnerability exists to exploit,
there is no probability that a threat damages network assets, and when a
network’s assets are not worth protecting, the measure of risk is low no matter
what the probability is of a threat exploiting network vulnerabilities. More
succinctly, risk from a threat to a computer network can be defined by the
following equations:

Risk = Probability of Attack x Impact of Attack (1)

Probability of Attack = Probability that Threat will Exploit Vulnerability (2)

Threat Probability = Threat Motivation and Threat Technical Capability (3)

Threat Motivation = Enough Resources, Likelihood of Attack Success (4)

Likelihood of Attack Success = Likelihood of Avoiding Detection and Impact (5)
(Ingoldby, 2009).
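
Equations (1) and (2) can be illustrated with a short numerical sketch. The
Python example below is an illustration under stated assumptions and is not
part of the cited model: the scenario names, probabilities, and impact scores
are hypothetical, and impact is expressed in arbitrary units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreatScenario:
    """One threat/vulnerability pairing (all values are hypothetical)."""
    name: str
    p_exploit: float  # Equation (2): probability the threat exploits the vulnerability
    impact: float     # impact of a successful attack, in arbitrary units

    def risk(self) -> float:
        """Equation (1): Risk = Probability of Attack x Impact of Attack."""
        if not 0.0 <= self.p_exploit <= 1.0:
            raise ValueError("p_exploit must lie in [0, 1]")
        if self.impact < 0:
            raise ValueError("impact must be non-negative")
        return self.p_exploit * self.impact


def rank_by_risk(scenarios):
    """Return the scenarios ordered from highest to lowest risk."""
    return sorted(scenarios, key=lambda s: s.risk(), reverse=True)


# Hypothetical scenarios chosen to mirror the two limiting cases in the text:
# no exploitable vulnerability, and an asset of little value.
scenarios = [
    ThreatScenario("no known vulnerability", 0.0, 1000.0),
    ThreatScenario("exposed control network", 0.5, 100.0),
    ThreatScenario("low-value test server", 0.25, 40.0),
]
ranked = rank_by_risk(scenarios)
for scenario in ranked:
    print(f"{scenario.name}: {scenario.risk()}")
```

In this sketch, the scenario with no exploitable vulnerability ranks last even
though its potential impact is the largest, which matches the observation that
a threat cannot damage assets when there is nothing to exploit.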

1.  Safety  Risk 


Safety risk to a network is most often identified with networks that provide
supervisory control and data acquisition, commonly called SCADA systems. SCADA
systems are usually critical industrial, mechanical or electrical equipment
operational control networks to monitor hardware system parameters and provide
control signals to adjust inputs to get desired outputs. SCADA systems are
networked to provide system managers information to change mechanical or
electrical parameters and change system outputs. These networks are usually
isolated and bounded by the industrial system they are controlling. However, if
the network extends to a multipurpose computer that has access to the Internet,
exploitation of these critical systems is possible from threats outside the
system’s boundary. For instance, if a control system on a ship supplies
information to a system manager’s computer also used for unclassified
networking to other information systems off the ship, the control system may be
vulnerable to attack from outside the ship. Since exploitation of a computer
network vulnerability can come from a variety of sources, it is important to
know if that adversity is from an intended act, an accident, or from a near
miss (incident), as this information supports the decision-making process as it
is applied to the management of risk.

2. Operational Risk

Operational risk is the threat to networked systems that provide information
and functionality to accomplish a mission. Many activities on a network can be
included in this definition. Civilian and military organizations want to
achieve certain effects by the actions they perform, and each organization has
short-term goals against which the effects achieved are measured. Any network
system contributing to the accomplishment of those goals is at operational risk
of failing to support their achievement. For instance, consider a network that
links UAV sensors to ground control stations, which may in turn be directly
connected to war fighters in the field who depend on UAV intelligence and
surveillance for targeting. That network is at risk operationally as long as a
way exists to exploit a vulnerability in the “sensor to shooter” network.
Operational risk includes threats to a network that inhibit the network from
achieving the desired effects supporting the goals of the unit or organization
to which it belongs. Operational risk is a broader view of risk to a network
than safety and program risk and must consider the integration of the people,
processes and systems used in the attainment of enterprise objectives and the
external forces that would prohibit attainment of those objectives (Kujawski,
2009).

3.  Enterprise  Risk 

The  aggregation  of  operational  risks  to  a  system  or  family  of  network  systems  can 
result  in  enterprise  risk,  which  is  the  threat  to  the  infrastructure  or  the  long-term  goals  of 
an  organization.  It  is  important  to  understand  that  risk  to  an  enterprise  network  is 
wrapped  up  in  the  nature  of  the  integration  and  interoperability  of  the  network’s 
components and sub-networks connected together. Enterprise risk differs from
operational risk in the organizational assets and ideals that may be threatened
and in the fact that the networks threatened by enterprise risk are most likely
virtually unbounded, much like the Internet. An unbounded network was defined
in Chapter I as a network in which no one entity can know who is connected to
the network at any one time or what connections are active in the vast array of
paths between nodes that exist in the system. Exploitation in network
enterprises usually takes place at the interfaces, and the network’s ability to
deter or respond to an attack can depend on whether the systems at the
interface are loosely or tightly coupled. Tight coupling offers a greater
amount of control over the entire system, whereas loose coupling offers system
resilience by containing the “infection” of an attack to one of the coupled
systems and limiting the spread of the attack.

In a network enterprise, the ultimate goal is to achieve strategic information
superiority. Enterprise risk is the probability that some threat agent has the
desire and the capability to exploit the network enterprise, reducing or
eliminating the information superiority of the enterprise, resulting in a
consequence of varying severity to the enterprise’s ability to achieve its
strategic goal. As enterprise risk takes on a holistic perspective, so must the
element of threat to the enterprise. The environment surrounding the network
enterprise contains threats from other national interests, rogue players,
evolving technologies, and internal people, processes and policies (Kujawski,
2009).
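
The containment effect attributed to loose coupling in this section can be
illustrated with a small sketch. The Python example below is a deliberate
simplification and not a model from the cited sources: it assumes that an
attack always crosses a tightly coupled interface and never crosses a loosely
coupled one, and the sub-network names are hypothetical.

```python
from collections import deque


def compromised_systems(interfaces, entry_point):
    """Return the set of systems an attack reaches from entry_point.

    interfaces: iterable of (system_a, system_b, coupling) tuples, where
    coupling is "tight" or "loose". Simplifying assumption: an attack
    spreads across tight interfaces and is stopped at loose ones.
    """
    neighbors = {}
    for a, b, coupling in interfaces:
        if coupling == "tight":
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)

    reached = {entry_point}
    queue = deque([entry_point])
    while queue:  # breadth-first walk over tight interfaces only
        current = queue.popleft()
        for nxt in neighbors.get(current, ()):
            if nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return reached


# Hypothetical enterprise of four sub-networks joined in a chain.
interfaces = [
    ("logistics", "personnel", "tight"),
    ("personnel", "intelligence", "loose"),
    ("intelligence", "targeting", "tight"),
]
spread = compromised_systems(interfaces, "logistics")
print(sorted(spread))
```

Under these assumptions, an intrusion that starts in the logistics sub-network
stops at the loosely coupled interface and never reaches the targeting
sub-network. A real enterprise would need a probabilistic treatment, since
loose coupling reduces rather than eliminates the chance of spread.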

C.  RISK  CONTROL  IN  A  NETWORK 

1.  Balancing  Security  with  Functionality  is  a  Team  Effort 

Management of risk requires the application of risk controls on a system.
Controls of risk to a program or system usually fall under the name of
security. The method of controlling risk in any information system, and
especially in the U.S. military’s large inventory of information systems as
they are networked together, is a balancing act between security technology and
procedural implementation by system administrators and the desire for
functionality by the users. The system administrators are charged with the
security of the system, while providing usability that meets the needs of the
mission. To accomplish this effectively, a dialogue between users and system
administrators must be established so that security schemes and strategies can
be aligned with the needs of the users, and the users must be trained in the
network system vulnerabilities and how the vulnerabilities are being mitigated
to lower the level of risk to the network. Without this dialogue, dealing with
risk includes risk transfer or avoidance, which may unnecessarily limit the
network functionality.

2. Controlling Risk is an Evolutionary Process Requiring Several Iterations

The controls put in place today do not necessarily guarantee effectiveness
tomorrow. As the threat changes in capability and motivation, and as the
network grows in size and changes in technical design, a robust risk management
plan requires constant evaluation of the risk controls in place and, where
needed, changes or improvements to their capability to lower the level of risk.
Table 2 (Mulokey, 2009, p. 27) outlines some of the effects evolutionary change
has on the security design and procedural implementation on network systems.


| Evolutionary Changes | Effects on System Security |
| --- | --- |
| The user’s tolerance for risk may change in response to changing world conditions, political considerations, or business priorities. | Stakeholders may require increased levels of assurance requiring tighter controls. Conversely, the greater need for the system’s outputs may justify higher risk. |
| System components become obsolete. | Commercial off-the-shelf products no longer supported by vendors become vulnerable to attack. |
| External sources can change the characteristics of their inputs to the system. | Anomalous responses to the changed inputs can degrade the system’s availability or become an attack vector for sophisticated hackers. |
| Advances in technology instigate system modifications to improve efficiency. | Design changes may increase security or provide new vulnerabilities. Security analysis, design recommendations, and testing are key to assure security performance. |
| Workforce characteristics can change due to the retirement of experienced personnel. | Undocumented procedures can be lost when experienced personnel are replaced. Improved process documentation is needed to support training new personnel. |

Risks to a network are always changing and must be reassessed frequently.

Table 2. Effects on Network Security due to Evolutionary Change (From: Mulokey,
2009)

D.  RISK  MANAGEMENT  OF  A  GLOBAL  NETWORK  ENTERPRISE 

1.  Complexity  of  Network  Risk  Management  for  an  Enterprise  System 

For an enterprise as complex and with such a breadth of coverage as the current
and the target GIG, a process of risk management is a huge undertaking. The
individual services’ systems of systems, as exemplified in the Navy’s FORCENet,
Air Force’s Challenger, and the Army’s Future Combat Systems, are in and of
themselves expansive networks involving multiple physical infrastructures of
connectivity and computational power and a vast array of software programs
designed to control, operate and manage complex weapons platforms and to
inform, train, and command many thousands of specialized soldiers and sailors.
Developing a system for risk management for the service components’ information
systems individually is problematic, as the eclectic mix of current systems and
their connectivity through thousands of different networks makes the assessment
of risk especially difficult. As is explored later, several studies have been
or are being conducted on assessing risk within individual network operations.
However, in the future, these individual networks will be tied together so that
war fighters can integrate intelligence and targeting data from selected
sources with the weapons systems they are operating, along with the decision
support systems that need to be available to achieve success in the mission. It
is the contention of this thesis that the U.S. military needs a workable and
understandable risk management system with the rigor to be comprehensive, the
structure to be consistent, and the flexibility to adapt to a changing
environment. The process must be able to identify and assess as many risks as
possible across the enterprise of the service components and across the DoD
infrastructure of the target GIG, and it must be a continuous process, which
monitors the state of the GIG network and feeds back information to improve the
process with every iteration. A comprehensive risk management system must be
designed to handle an eclectic mix of numerous network components and their
interconnections and interfaces, and to handle the large amounts of data
required to make a trustworthy assessment of the system’s risks, so that
decisions made to counter the threats are well informed and effective,
contributing to the success of net-centric operations. The process must also
keep the management plan current with changing technologies, new threats, and
updated military, political, economic and diplomatic strategies of the
government. It is shown that this enterprise management system must take a
top-down view as well as a bottom-up view, decomposing the risks across the
myriad of networks, each of which requires special sets of values, and then
integrating the risk process across the many networks to establish a true
enterprise risk management system. This is especially important because DoD and
the individual services have chosen to architect their networks in accordance
with Service Oriented Architecture, as discussed in Chapter II.

2. Beginning the Process Early in a Network System’s Lifecycle

Viewed from the top level down, the networked system of systems comprises the
infrastructure of the GIG, and risk management of the network enterprise must
be an overarching process that includes and encompasses the risk management of
the infrastructure’s individual network systems. The challenge in a network of
this size and complexity is to mesh the overall risk management plan with the
individual systems’ risk management plans so that there is a unity of effort
and a common set of objectives. The most advantageous time in the system’s
lifecycle to start this is at the beginning, during concept definition, which
draws on the network system’s concept of operations. If the risks to the
network enterprise are not identified and assessed, with the start of a
management plan in place, before the requirements and specifications are
defined and allocated to system components, the architecting of the system
during preliminary design does not consider how to reduce or eliminate
vulnerabilities that could be exploited, threatening critical data and system
operation. In essence, the need for operational capability competes with the
need for security, and the resultant architecture is reactive to the threats as
they are encountered during the later stages of the network’s lifecycle.

E. PROPOSED NETWORK ENTERPRISE RISK MANAGEMENT PROCESS (NERMP)

1.  Review  of  Available  Software  Risk  Management  Processes 

The Computer Emergency Response Team (CERT) and the CERT Coordination Center of
the Software Engineering Institute of Carnegie Mellon University have been
supporting network security for DoD for several years. Their model (Carelli &
Young, 2008) for analyzing risk is a multi-dimensional model, which encompasses
the following.

•  Incident response risk model and assessment

•  Software process risk model and assessment

•  Operational security risk model and assessment

•  Other risk models

CERT’s classic model, OCTAVE, for operational security of computers and
networks has the following steps.

•  Identify 

•  Analyze 

•  Plan 

•  Implement 

•  Monitor

•  Control

•  Repeat 

This process is similar to the program risk management of the DoD acquisition
program, and many of the activities of OCTAVE (surveys, workshops,
questionnaires, and auditing) mirror the program risk management requirements
and activities of the DoD risk management guide for acquisition.


2. NERMP Details

Drawing on the process model from CERT, the DoD Risk Management Guide for
Acquisition, lectures and articles by Yacov Haimes, and course materials for
enterprise risk management (Kujawski, 2009), Figure 14 graphically displays a
process for risk management, which is discussed in detail below the graphic.

[Figure 14 image not reproduced. Figure annotations: “As the network grows and
the environment changes, process repeats”; “NERMP is similar to the process of
the last chapter but is designed for network enterprises.”]

Figure 14. Network Enterprise Risk Management Process (NERMP)

The Risk Analysis Process is a method to systematically identify the components
of risk stated above and to assess the identified risks, determining what could
go wrong and with what probability, so that risks can be prioritized. The Risk
Management Process continues once risks have been analyzed. It is then
necessary to manage the risks by determining what can be done about them,
matching available resources against risk implementation plans, and controlling
and monitoring the network for the effectiveness of the risk management
decisions, as well as determining the effectiveness of survivability and
resiliency measures for system recovery. Each of the components of risk, namely
assets, threats, and vulnerabilities (Figure 14, column 1) and recovery and
resiliency (Figure 14, column 4), is examined briefly below for its
contribution to the risk analysis process as it applies to a network
enterprise.

•  Asset  valuation  is  an  important  and  complex  part  of  the  process.  The  value 
of  network  assets  is  more  than  just  the  cost  of  hardware,  software  and 
operational  costs  of  the  network.  A  thorough  valuation  must  take  into 
account  the  assets  of  information  and  functionality  provided  by  the 
network  and  the  value  of  these  assets  to  achieving  the  organization’s 
goals.  This  is  a  moving  target,  though.  Information  in  the  military  is  often 
transitory, valuable for the moment or the situation at hand. The effects
that the organization is trying to achieve to meet its goals are usually longer
term, but even those shift as the environment surrounding the network
enterprise changes. However, an honest valuation of assets in the
identify  phase  supports  the  assessment  of  the  impact  of  an  attack  on  the 
network  and  where  to  place  priority  on  the  choices  to  handle  and  monitor 
the  identified  risks. 

•  Threat identification, assessment and management (such as it can be done)
is equally complex, since the probability of an attack from a given threat
agent is difficult to quantify statistically. Unlike a reliability failure,
which can be modeled statistically, an attack is launched based on the
threat’s subjective analysis of its likelihood of success in avoiding
detection while doing the maximum amount of damage.

•  Vulnerability of the network is the one risk component over which there is
some control. However, in the age of COTS and SOA, identifying, let alone
assessing and managing, the vast number of ways to exploit a network
enterprise as expansive and technologically complex as the GIG can be
overwhelming in itself. Several sources (Clark, Sollins, Wroclawski, & Faber,
2003; Chittister & Haimes, 2006; Haimes, 2009; Jones & Ashenden, 2005;
McCabe, 2009; Storey, 1996) have estimated that an individual computer
operating system contains over 200 million lines of code, and that, of those,
two million have “bugs” (defects). The challenge is to determine which bugs
are exploitable and what to do about them. Other vulnerabilities are explored
in depth below.

•  System recovery and resiliency are increasingly important to a
comprehensive risk analysis. For the attacks that do get through, it is
important to know how the network recovers, what information needs to be
sustained during an attack and how quickly the network can get back to full
capability after an attack. This analysis has to be conducted before an
attack to make tradeoffs and decisions about what recoverability features
need to be part of the network, such as redundancy, replication, diversity,
and distribution. Survivability characteristics of recognize, resist and
recover are part of the requirements that comprise the other functionality
performance parameters of a network.

The process, if done correctly, establishes priorities and allows for the
selection of alternatives on which to make decisions on how and where to
protect the system. Many of these decisions are made in uncertain
circumstances, and the decisions made rely on some prediction of probabilities
and of consequences. Since an effective risk management program must start in
the first phase of a system’s or network’s lifecycle and continue throughout,
an effective risk management plan guides decisions on how to architect a
protected yet functional network that optimizes the competing objectives of
system capability and system protection. As Yacov Haimes says of risk
assessment and management, “The risk assessment and management process is aimed
at answering specific questions in order to make better decisions under
uncertain conditions” (Haimes, 2009, p. 22).

At its conclusion, the process should answer some of the following questions.

•  What needs to be protected and why

•  What is being protected against

•  How much protection is required

•  How does the protection inhibit operation of the system and affect the
desired outcome or ability to achieve the system’s objectives

•  What protection is it possible to do without to make the system more
functional

•  What system functionality can be forfeited because of the unknown threat
or vulnerability

•  How is the protection to be implemented

•  How much does the protection cost (to implement, in opportunity cost, in
training and monitoring)

F. IDENTIFYING, EVALUATING AND MEASURING THE ELEMENTS OF RISK ANALYSIS AND
ASSESSMENT

1. Asset Valuation

Appraisal of an asset is important in that its value depends on the end product
or mission that asset is designed to achieve. For instance, given a similar
information system in two organizations with similar vulnerabilities in that
system, an adversary who successfully exploits that vulnerability causes
greater damage to the organization with the more critical mission. As with any
system design, development or operation, the engineer needs to know the
system’s requirements, how those requirements are going to be or are being met
by the system architecture, and whether the functionality of the system is
designed to achieve the objectives of all the stakeholders. An assessment of
risk must also start from the vantage point of a thorough knowledge of the
system under assessment. A detailed system definition comprising the system’s
requirements, specifications, and objectives is required before the system’s
vulnerabilities can be assessed. In several studies on risk assessment of
networked computer systems, and especially in Storey’s analysis of
safety-critical computer systems, the first step in the study of system risk is
to discover asset value (Jones & Ashenden, 2005).

Placing value on network assets is not just about the dollar value of the
information asset exploited, but also about the consequences that exploitation
has on the organization in terms of missed opportunity or damage to critical
data, people or infrastructure. Not only is the consequence of any given
exploitation important, but so is the interrelationship of the value of that
system to the value of a connected system and the cascading effects that an
intrusion and attack on one system may have on another. In the DoD enterprise,
this process is problematic because of the large inventory and wide diversity
of networked information systems now used for a host of capabilities throughout
the services. Add to this the direction of the new architectural framework in
the form of service oriented architecture, and the ways in which one system’s
degraded condition can affect the condition and capability of another become
practically endless. There are many systems, and many people and organizations
at stake, each with an opinion as to whose system is more important.
Additionally, a networked system takes on different values depending on the
operation for which it is used and the environment in which it is operated.
FORCENet and the GIG in general also envision the ability to set up ad hoc
networks, which raises the question of what a network is worth as its
composition changes to achieve a temporary objective. The military has employed
a similar system of information valuation since it began safeguarding
classified information: the level of classification is determined by the
severity of the information’s loss to the conduct of U.S. operations. For
instance, the loss of top secret information would cause grave consequences to
U.S. operations and to U.S. forces should it fall into unauthorized hands. This
same idea, but on a far larger and more rigorous and robust level, is needed to
place a value on the multitude of networked systems in service today and
planned for the future in the GIG. The Navy has taken the initial steps in this
type of classification by the development of Cross Domain Solutions and
Multi-level Security (CDS/MLS) processes within U.S. information systems.
However, with the fluid nature of these systems’ capabilities as they relate to
the environment and the national security situation, and as they transform with
the addition of changing technologies, it is necessary to conduct a top-down
analysis of system value both as a stand-alone system and in the context of
interoperability (including SOA) and collaboration with other systems. With the
desired capability of creating ad hoc networked systems to meet a tailored
mission, the valuation of individual systems changes and must be addressed each
time one of these ad hoc networks is created.
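
The dependence of a system’s value on the systems connected to it can be made
concrete with a rough sketch. The Python example below is only one possible way
to express the idea, under stated assumptions: the system names, the
stand-alone values, and the rule that a system’s contextual value is its own
value plus the value of every system that depends on it are all hypothetical
and are not taken from DoD guidance.

```python
def contextual_value(standalone, depends_on, system):
    """Stand-alone value of `system` plus the stand-alone value of every
    system that depends on it, directly or indirectly (each counted once).

    standalone: dict mapping system name -> stand-alone value
    depends_on: dict mapping consumer -> list of providers it relies on
    """
    # Invert the dependency map: provider -> set of consumers.
    consumers = {}
    for consumer, providers in depends_on.items():
        for provider in providers:
            consumers.setdefault(provider, set()).add(consumer)

    seen = {system}
    stack = [system]
    while stack:  # depth-first walk over everything downstream of `system`
        current = stack.pop()
        for consumer in consumers.get(current, ()):
            if consumer not in seen:
                seen.add(consumer)
                stack.append(consumer)
    return sum(standalone[name] for name in seen)


# Hypothetical values on an arbitrary scale.
standalone = {"uav_sensor": 3, "ground_station": 5, "fire_control": 8, "payroll": 2}

# Hypothetical ad hoc "sensor to shooter" network.
depends_on = {
    "ground_station": ["uav_sensor"],
    "fire_control": ["ground_station"],
}
print(contextual_value(standalone, depends_on, "uav_sensor"))
```

In this sketch the sensor has the lowest stand-alone value of the three mission
systems but the highest contextual value, because everything downstream fails
with it. Re-running the calculation with a different `depends_on` map
corresponds to revaluing systems each time an ad hoc network is formed.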

Taking a mission-oriented perspective, Donald Buckshaw et al. (Date?) take a
back-door approach to asset valuation by modeling a value hierarchy on the
adversary, the user, and the service provider. Instead of trying to optimize
the system under a model of multiple competing objectives, they use the
Value-Focused Thinking methodology introduced by Keeney and Raiffa. Their main
focus in assigning value was on an adversary model, but they also applied this
methodology to a valuation of assets in their user and service provider models
to help quantify the importance of assets from a mission-oriented perspective,
which then leads to an assessment of the severity of the consequences of an
attack on a networked system. Appendix B examines this model.
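
Value-focused approaches are often implemented with an additive value model, in
which each asset receives a weighted sum of scores on individual attributes.
The Python sketch below assumes that form purely for illustration; the
attribute names, weights, and scores are hypothetical, and the sketch should
not be read as the model of Buckshaw et al.

```python
def additive_value(weights, scores):
    """Weighted sum of single-attribute scores, each score in [0, 1].

    weights: dict mapping attribute -> relative weight (any positive scale)
    scores:  dict mapping attribute -> score of one asset on that attribute
    """
    if set(weights) != set(scores):
        raise ValueError("weights and scores must cover the same attributes")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    for name, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must lie in [0, 1]")
    # Dividing by the total weight keeps the result in [0, 1].
    return sum(weights[k] * scores[k] for k in weights) / total_weight


# Hypothetical attributes and weights for a mission-oriented valuation.
weights = {"mission_criticality": 2, "timeliness": 1, "replaceability": 1}

# Hypothetical assets scored on each attribute.
targeting_feed = {"mission_criticality": 1.0, "timeliness": 1.0, "replaceability": 0.5}
admin_portal = {"mission_criticality": 0.25, "timeliness": 0.5, "replaceability": 0.25}

print(additive_value(weights, targeting_feed))
print(additive_value(weights, admin_portal))
```

The resulting scores give a relative ordering of assets that can feed the
impact term of a risk calculation. The ordering is only as sound as the
weights, which in practice would have to be elicited from the stakeholders.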

2. Vulnerability Determinations

Hand in hand with the threat assessment piece of risk assessment is a
determination of a system’s vulnerability. Without a vulnerability, the threat
agent cannot harm the system, and without a threat, the vulnerable part of the
system is not a risk (other than normal reliability considerations). The
interrelationship of threat to vulnerability is captured in the attack tree
analysis mentioned in the next section. The path an attacker might take to
reach an objective of doing harm to the system leads into an analysis of the
scenario that points to the vulnerable elements of the system and to the
hardware components or software routines that might be vulnerable and require
attention. This is done routinely in the commercial world with the
identification of a path or scenario an attacker might take and identification
of what components are exploitable. For instance, in 2007, it was found that no
matter how carefully a network topology is guarded within the network, it is
possible for an intruder to gain network knowledge while packets are transiting
the Internet through a process called reverse Domain Name System (DNS) lookup.
Transmission Control Protocol (TCP) packets captured on the Internet gave
internal IP address information to an unauthorized source (Faber, 2009). This
can be prevented by configuring the network’s DNS servers to keep reverse
lookups internal to the network, but the vulnerability illustrates the breadth
of system vulnerabilities and the huge task of finding and fixing them.
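
The relationship between an attacker’s path and the vulnerabilities along it
can be sketched as a small attack tree. In the Python example below, which is
an illustration under stated assumptions rather than the analysis used in this
thesis, OR nodes require any one child and AND nodes require all children. The
cost figures and the insider branch are hypothetical. Setting a leaf’s cost to
infinity models a vulnerability that has been removed, such as keeping reverse
lookups internal to the network.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    kind: str = "leaf"   # "leaf", "and", or "or"
    cost: float = 0.0    # attacker effort for a leaf (hypothetical units)
    children: list = field(default_factory=list)


def cheapest_attack(node):
    """Minimum attacker cost needed to achieve the goal at `node`.

    OR nodes need any one child, so take the minimum child cost.
    AND nodes need every child, so take the sum of child costs.
    """
    if node.kind == "leaf":
        return node.cost
    child_costs = [cheapest_attack(child) for child in node.children]
    if node.kind == "or":
        return min(child_costs)
    if node.kind == "and":
        return sum(child_costs)
    raise ValueError(f"unknown node kind: {node.kind!r}")


def build_tree(reverse_dns_cost):
    """Hypothetical tree for the goal of learning a network's internal topology."""
    return Node("learn internal topology", "or", children=[
        Node("capture packets and resolve addresses", "and", children=[
            Node("capture TCP packets in transit", cost=2),
            Node("reverse DNS lookup answered externally", cost=reverse_dns_cost),
        ]),
        Node("recruit an insider", cost=10),
    ])


exposed = cheapest_attack(build_tree(1))          # reverse lookups leak externally
hardened = cheapest_attack(build_tree(math.inf))  # reverse lookups kept internal
print(exposed, hardened)
```

Closing the reverse lookup leaf does not make the goal unreachable in this
sketch. It forces the attacker onto the more expensive branch, which is the
kind of tradeoff an attack tree is meant to expose.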

Determining the vulnerability of a system is a very complex problem. In
networked information systems, vulnerabilities can exist in system software
(the most visible these days), hardware, firmware, user or service provider
personnel, policies and procedures, or in the common practices of use and
applicability. In the software sector alone, vast libraries have been compiled
to document and catalogue the myriad of vulnerable components, programs and
paths in the vast array of computer systems and networks. Vulnerability can
show up at any level of the system from the physical layer to the most advanced
application. In addition, because of the complexity and diversity of the
technology, the path to a vulnerable component can take many different routes
depending on the instrument or method used to penetrate a system.

Vulnerabilities are a natural byproduct of the quantity of systems that have been
developed with less-than-perfect quality, incomplete requirements, and limited
testing. The higher the quality, the fewer the vulnerable components, but the more
difficult it becomes to detect the remaining software flaws. Software quality was
discussed in Chapter II in connection with CMMI, a process for quality control in software
development intended to raise the level of software quality and so diminish the
software faults that open up vulnerabilities in the software architecture.
Software and systems engineering need to grow in scope and capability because, even as
the quality of software products improves, other additions to the software inventory open
up new paths to exploit an upgraded system, making it difficult to catch up. On the
software side, patches have been used in commercial and military enterprises to fix or
lessen a vulnerability when it is discovered. It is far easier to find and fix a
vulnerability during the development stage of a system’s life cycle than after the system
has been deployed. However, the use of COTS components and programs makes this
problematic. In fact, one of the downsides to open architectural development and reuse
in a SOA is that the developer may have little if any insight into the contents of a
software program, or of a component with embedded software, and its vulnerabilities. This
is especially true of purchased material (programs and components) whose design
architectures are proprietary. Some of this risk can be mitigated by disabling portions of
COTS programs not being used; layering additional security programs on top of the
proprietary program (although this may itself create more vulnerabilities); conducting
research and development into technological tools to analyze the COTS coding (an expensive
alternative that must be weighed in the risk assessment); or planning a response
mechanism or procedure for fixing flaws (more expensive, but probably necessary, is the
capability of fixing flaws “on the fly” to improve recoverability) (Anderson & Hundley,
1998). COTS use in building systems has gained wide acceptance within the military’s
acquisition community because it saves on system development costs, on personnel
training costs (many users have used like systems in the commercial world), and on the
costs of inventory stocking and supply chain management (although some control is lost
when relying upon a vendor for spares, and when dealing with multiple vendors, the
problem can multiply rapidly). While the cost advantages of using an open architecture
approach to the design of networked systems look appealing, the remedies cited above can
come with their own enormous price. Neil Storey (1996) notes in his book, Safety-Critical
Computer Systems, that testing to find all the vulnerabilities in just one piece of a
program (what he calls exhaustive testing) can be time consuming and costly. The procedure
must test all possible binary inputs to a system against their outputs (black box testing
when the code is unknown, as in proprietary software). He points out that “With a small
program of 40 inputs, the test involves the input and measurement of output of 10^12
combinations. When the subsystem components are known, a check of each component as its
state changes (binary patterns) can be conducted, but in a simple 8 byte microprocessor,
exhaustive tests of all combinations of states will be and to only look at the combination
of failures due to a bridging fault on just three nodes, would require 10 combinations.”
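
The scale of exhaustive black box testing follows from simple counting: a component with n binary inputs admits 2^n distinct input patterns. The sketch below works through the 40-input case. The test rate of one million cases per second is an assumption chosen only for illustration, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def exhaustive_cases(binary_inputs: int) -> int:
    """Number of distinct input patterns for `binary_inputs` binary inputs."""
    return 2 ** binary_inputs


def days_to_test(binary_inputs: int, cases_per_second: float) -> float:
    """Days needed to apply every pattern once at an assumed test rate."""
    return exhaustive_cases(binary_inputs) / cases_per_second / SECONDS_PER_DAY


cases = exhaustive_cases(40)
print(f"{cases:.3e} input patterns")                     # on the order of 10^12
print(f"{days_to_test(40, 1e6):.1f} days at 1e6 cases/s")
```

Each added input doubles the count, which is why exhaustive testing becomes impractical well before a component reaches realistic size.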

Obviously, to test a component of COTS software for vulnerabilities, sampling
techniques must be used (what Storey calls coverage-based testing) to develop statistical
measurements that give a level of confidence that a system’s vulnerabilities are limited.
Still, many vulnerabilities may, and probably do, exist.
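
One common way to turn sampled test results into a confidence statement is the zero-failure binomial bound: if N independently sampled test cases all pass, the per-case failure probability p satisfies p <= 1 - (1 - C)^(1/N) with confidence C. The sketch below is an illustration of that general statistical idea under the assumption of independent random sampling; it is not necessarily the procedure Storey describes, and the function name is hypothetical.

```python
def failure_rate_upper_bound(passed_tests: int, confidence: float = 0.95) -> float:
    """Upper bound on the per-case failure probability after `passed_tests`
    independent sampled cases have all passed (zero observed failures)."""
    if passed_tests <= 0:
        raise ValueError("at least one passed test is required")
    return 1.0 - (1.0 - confidence) ** (1.0 / passed_tests)


# With 3,000 clean sampled cases the 95% bound is close to the familiar
# "rule of three" approximation, 3 / N.
bound = failure_rate_upper_bound(3000)
print(f"{bound:.6f}")
```

The bound shrinks only in proportion to 1/N, which is consistent with the observation that sampling can limit, but never rule out, residual vulnerabilities.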

One of the difficulties in looking for vulnerability when conducting the risk
assessment process is that, as technology improves at a rapid pace, risk management
methodologies are slow to incorporate the new technology and to identify the new
vulnerabilities it creates. The challenge is to maintain a vulnerability library that is
accessible and current with the range of vulnerabilities to the system’s network, the
various operating systems on that network, the ever-expanding quantity of applications in
each information system, and the connectivity protocols used for interoperability. The
vulnerability library provides the latest known or experienced paths that lead through
the system’s vulnerabilities and the ways to block these paths. In addition, the
vulnerability library needs to remain current with the technology of vulnerability
detection devices and intrusion alert mechanisms. All of these factors are important to
the analysis of the risks the network faces and to the assessment with which they are
dealt.

When considering the system’s vulnerability, the security specialist must look at
detecting intrusions into the system’s vulnerabilities, identifying the vulnerability
being exploited, and selecting the proper countermeasure to prevent the intrusion from
proceeding. In addition, consideration must be given to how the system maintains its
minimum or critical functionality if the intrusion is successful, and how the system
limits the extent of the intrusion and stops any progressive or cascading effects due to
the intrusion. Many methods exist for intrusion and vulnerability detection, including
rule-based pattern recognition, forced intrusions to make the system react before an
attacker penetrates the system, and inference-based testing to recognize the
interrelationships between components in the threat environment. Identifying an intrusion
once it is detected is important to knowing how to stop it. Access to a database that
catalogues known malicious agents is important to the identification and resolution of an
intrusion on a system. It is also a key element in the identification of countermeasures,
which can be applied in the planning and design of a system or during operation and
recovery of the system once an intrusion has been detected.

Analyzing the threat agent together with the system vulnerability that the agent
might exploit is one of the key elements in the calculation and assessment of risk,
and is crucial in managing that risk.

3. Threat Assessment

A threat to a distributed networked information system is composed of an
adversary (an entity with intent to cause harm or disruption) who uses tools (viruses,
worms, information overload, software altering devices) to produce an undesirable effect
(information denial, corruption, theft, fabrication), which reduces capabilities or causes
harm in safety-critical applications. The military makes it its business to know who
the threat is today, but it is quite uncertain who the adversary is going to be in the
future. Also, in today’s environment with an asymmetric enemy, it is the opinion of this
paper that the identification of the threat can get muddled in the identification of the
loyalties of the threat and the ideals that motivate them. By not knowing who the
adversaries are, one is less certain of the tools they might use, what their motivation
is, and what effect they want to inflict on the distributed network system.

Much work has been done, both in the public sector by the major producers of
software and computer and network systems and, in a less visible manner, in the military,
to categorize and recognize the vast array of methods and tools available to an adversary.
The major effort has been in building libraries of threat data and techniques with
appropriate remedies. In the simplest terms, companies such as Symantec and McAfee, which
sell remedies to private consumers through distributed software and routine programs,
maintain the libraries in the public sector.

To  be  classified  as  a  threat,  a  threat  agent  needs  to  have  opportunity,  motivation, 
resources,  inside  knowledge,  and  a  finite  amount  of  time  to  accomplish  an  objective.  If  it 
is  too  hard  to  penetrate  a  system,  or  once  penetrated,  too  difficult  to  achieve  the  desired 
effect  (such  as  denial  of  service),  or  it  takes  too  many  resources  or  time,  or  the  desired 
effect  does  not  have  the  impact  that  would  make  it  worth  the  expenditure  of  those 
resources, then the likelihood of the adversary actually attempting to breach a system
decreases. It is also possible that other inhibiting or amplifying factors may come into
play to make an entity a threat. Deception can act as an inhibitor, and a perceived lack
of retribution can be an amplifier.

The assessment of the threats that are a potential risk to a networked system is an
integral part of the overall risk assessment and management of those risks. Most
methodologies for determining the source and the qualities of a threat (Jones & Ashenden,
2005; Buckshaw, Parnell, Unkenholz, Parks, Wallner, & Saydjari, 2008; Hamdi &
Boudriga, 2005) rely on finding a logical grouping of agents to decompose further into
their motivation, capabilities, and resources, as well as the triggering factors, or
catalysts, and the timing that would cause a threat agent to exploit a system’s
vulnerability. Each of these categories can be further decomposed and weighted on a
constructed scale (when no natural scale is available). In their Mission Oriented Risk and
Design Analysis (MORDA) model of risk assessment, Donald Buckshaw et al. (2008, p. 24) use
what they call a swing weight matrix in their determination of value for the level of
threat to a system as well as the other components they consider in their assessment of
risk. The swing weight matrix considers the change in relative importance between
different measurements of value when the value measurements go from worst to best possible
level. An overview of their methodology is contained in Appendix B, examples of risk
models. It is important in any assessment system to weight the factors properly relative
to one another and to know why the weighting is being distributed a certain way over an
alternative. Since much of the input data used in a given methodology is subjective in
nature, it is important to process that data in a logical sequence so that the process is
repeatable (especially if the model is used in a continuous analysis) and the results
obtained are consistent and justifiable within the constructed scale chosen. Finally, the
results have to be useful for decision making, which requires an understanding by the
decision maker of the basis on which the decision is being made.
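
The weighting idea can be illustrated with a small additive value calculation. The sketch below is not the MORDA swing weight matrix itself. It assumes, for illustration only, four threat factors, raw swing weights on a 0 to 100 constructed scale, and factor scores between 0 (worst) and 1 (best from the attacker's standpoint). All names and numbers are hypothetical.

```python
def normalize_swing_weights(raw: dict[str, float]) -> dict[str, float]:
    """Scale raw swing weights so that they sum to 1."""
    total = sum(raw.values())
    return {name: weight / total for name, weight in raw.items()}


def threat_value(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Additive value model: weighted sum of the factor scores."""
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical raw swing weights: how much a worst-to-best swing in each
# factor matters relative to the most important factor (set to 100).
raw_swing = {"motivation": 100, "capability": 80, "resources": 50, "timing": 20}
weights = normalize_swing_weights(raw_swing)

# Hypothetical scores for one threat agent on the constructed 0-to-1 scale.
agent_scores = {"motivation": 0.9, "capability": 0.6, "resources": 0.4, "timing": 0.5}
print(round(threat_value(agent_scores, weights), 3))
```

Recording the raw weights alongside the normalized ones keeps the rationale for the weighting visible, which supports the repeatability and justification requirements discussed above.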

When looking at threat amplifiers (events or motivating factors that encourage
attack) and inhibitors (changes in cultural attitudes or political roadblocks) in regard
to a specific threat agent, one must consider the effect the amplifier or inhibitor has on
the agent, on the environment in which the agent and the protected system exist, and on
the system being protected. Consider, as an amplifier, a threat agent’s search for
recognition. It has an effect on the threat and the system, but the environment may not
come into play; whereas the speed at which technology changes and improves affects the
system and the environment, but any adjustment to the magnitude of change probably does
not affect the threat agent.

As one piece of the puzzle of gaining an understanding of the risks to distributed
networked systems, a knowledgeable threat assessment provides an overall risk
assessment with valuable information. Without a threat, the fact that vulnerabilities exist
in the information systems becomes academic; an exercise in technological expertise to
plug the holes and reach a certain level of perfection. However, this is certainly not the
case. Numerous threats are present and are waiting for the right time to exploit networks
to achieve the goal of degrading or incapacitating information capability. The threat is
not just from the outside (although outsiders may already have achieved a position inside
U.S. systems, either with “time bombs” in the software or by compromising users). The
Computer Emergency Response Team (CERT) at the Software Engineering Institute has
examined case studies of insider malicious activity, including sabotage, espionage, fraud,
theft, and manipulation. They have developed a simulation to help uncover potential
cases of insider malicious activity through the use of behavior modeling and the
consequences of improper applications of new technology, as well as practices for
personnel screening and authorization to systems. This study has focused lately on the
potential for insider activity during the system’s development lifecycle, where virulent
code has been inserted during development, among other deleterious actions that may take
years to uncover (CERT, 2008). Knowing the threats and planning for their reduction is
an important step in any risk management process, both before a system degrades or fails
and, equally important, when planning how to operate a system at reduced capability and
how to recover as quickly as possible.

4. System Recovery

A subsequent chapter explores what makes a system survive in a hostile
environment, particularly a distributed network of information systems relying on one
another for complete functionality and the ability to meet objectives, and how network
survivability is inherent in the network enterprise risk management approach. This section
gives an overview of the role recoverability plays in a risk assessment process and what
is meant by reactive risk analysis. Most methodologies and risk assessment models focus on
a risk management scheme to prevent the undesirable outcome. This is why recovery or
reaction is routinely left as an afterthought in most risk assessment models. Once a
threat has exploited a system’s vulnerability and achieved its objective (disruption,
theft, or denial of service), the risk has manifested itself. Therefore, what do reactive
risk and system recovery have to do with risk assessment, and why should they not be
subordinated but treated as an equal step in the risk assessment process? Principally
because an attack cannot always be avoided, nor can all risks be mitigated to zero; with a
system this complex, and with such determined and ingenious adversaries, it is not a
question of if but of when, and it is absolutely necessary to be ready for this
eventuality.

The major measurement of reactive risk assessment is time, and the major tool in
the risk assessment arsenal for handling this ability to react is the intrusion detection
device. The element of uncertainty in this case, which makes it applicable to the risk
management process, is how much time is available in a given attack, how much time remains
before the system loses the minimum amount of capability needed to accomplish the mission,
and whether intrusion detection gives the system time to react or, more simply, whether it
detects the attack at all or detects too much (false alarms). In the survivability
chapter, the quantification of these survivability characteristics is examined by
exploring some system models that use optimization and simulation to characterize the
survivability of a network. The critical element in these models is the value these
characteristics contribute to the survivability of the network so that it meets
requirements. The other critical element, mentioned only cursorily in the models, is how
the requirements for survivability are determined and at what point in the system life
cycle the survivability requirements should be decided. The contention of this thesis is
that protection, risk mitigation, and survivability requirements for the system or network
must be integrated into the general requirements definition phase, and they ought not to
be dealt with apart from the rest of the systems engineering process.
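
The timing question can be stated as a simple margin: the time available before the mission loses its minimum capability, less the time taken to detect the intrusion and to respond to it. The sketch below is a minimal illustration under that assumption. The class and field names are hypothetical, the time units are arbitrary, and an undetected attack is represented by an infinite detection delay.

```python
from dataclasses import dataclass


@dataclass
class AttackTimeline:
    time_to_mission_loss: float  # time until minimum capability is lost
    detection_delay: float       # time from attack start to detection
    response_time: float         # time from detection to containment

    def reaction_margin(self) -> float:
        """Time left over after detection and response; negative means too late."""
        return self.time_to_mission_loss - (self.detection_delay + self.response_time)

    def recovers_in_time(self) -> bool:
        return self.reaction_margin() >= 0


detected = AttackTimeline(time_to_mission_loss=60, detection_delay=10, response_time=30)
missed = AttackTimeline(time_to_mission_loss=60, detection_delay=float("inf"), response_time=30)
print(detected.reaction_margin(), detected.recovers_in_time())
print(missed.recovers_in_time())
```

False alarms do not appear in this margin directly; they would show up as response time consumed on events that are not attacks.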

G. ATTACK TREES: A USEFUL TOOL IN RISK IDENTIFICATION AND ASSESSMENT

Drawing at once on the concepts of game theory and the fault trees used in reliability
analysis, attack trees can be useful in identifying and analyzing network vulnerabilities
and the paths that can be exploited to gain access to the assets of the network. At the
same time, they are useful in gaming the attributes of a threat that would make the threat
more or less likely to attack the network. As discussed above under threat assessment, a
threat agent weighs multiple motivations and constraints in deciding whether or not to
execute an attack on a network. Several of the risk management process models discussed in
Appendix B use attack trees in their identification and assessment of network
vulnerabilities and the motivating factors of threat agents (Buckshaw et al.,
2005; Hamdi, 2005; Hernandez, 2001; Jones, 2005).

1.  Identifying  and  Correcting  Vulnerabilities 

Attack trees use the same methodology as fault trees, which analyze the paths to
system faults when determining overall system reliability. The attack tree, shown in
Figure 15, takes the ultimate objective of the attacker as the root of the tree, then
expands the tree through the different logical steps an attacker would be required to take
to reach the ultimate objective. Developing the attack tree leads network developers to
discover where the network’s vulnerabilities are and the paths to get to a vulnerable
network attribute. Analysis of the attack tree paths also reveals methods to eliminate or
reduce the vulnerabilities, while in turn informing developers of what the effect on
system functionality might be for certain remedies to fix a vulnerability.

Attack tree analysis of the route an attacker takes helps to organize the
evaluation of network vulnerabilities and identify those needing to be
fixed.

Figure  15.  Sample  Attack  Tree  Analysis  of  a  Threat 


In Figure 15, the logic operators, AND and OR gates, identify either alternative
paths to the ultimate objective (OR) or the combination of steps required to reach the
next level (AND). For instance, the OR gate (red arrow in Figure 15) affects database
confidentiality, since the database is vulnerable either to direct access to the database
proper or to an unauthorized download of information to a network enclave of lesser
classification and easier access directly from the Internet. The AND gate (blue arrow in
Figure 15) acts on the unauthorized cross-domain transfer by requiring both the use of a
packet sniffer and translation of XML security addressing to get to the information
unintentionally moved to a less secure location on the network.
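
The gate logic described for Figure 15 can be expressed as a small data structure. The sketch below is an illustrative encoding only: the node names paraphrase the text, the structure is assumed from the description rather than copied from the figure, and the class and function names are hypothetical. A leaf is satisfied when the attacker holds the corresponding capability, an OR gate needs any child, and an AND gate needs all children.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    gate: str = "LEAF"  # "AND", "OR", or "LEAF"
    children: list["Node"] = field(default_factory=list)


def achievable(node: Node, capabilities: set[str]) -> bool:
    """Return True if the attacker's capabilities satisfy this node."""
    if node.gate == "LEAF":
        return node.name in capabilities
    results = [achievable(child, capabilities) for child in node.children]
    return all(results) if node.gate == "AND" else any(results)


tree = Node("compromise database confidentiality", "OR", [
    Node("access database directly"),
    Node("exploit unauthorized cross-domain transfer", "AND", [
        Node("use packet sniffer"),
        Node("translate XML security addressing"),
    ]),
])

print(achievable(tree, {"use packet sniffer"}))  # only one half of the AND gate
print(achievable(tree, {"use packet sniffer", "translate XML security addressing"}))
```

Removing a capability from the set and re-evaluating the root is a quick way to see which countermeasures actually close a path.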

2.  Minimal  Cut  Sets 

Minimal cut sets (Haimes, 2009) in the attack tree can point the analyst to the
likely scenario of attack and where to look within the individual components of the
system for the vulnerabilities requiring the most attention. A minimal cut set is a
smallest set of attack nodes that an attacker must complete to reach the top goal of the
attack. Additionally, the analysis of possible scenarios following certain paths through
the attack tree can lead an analyst to discover patterns in the way attackers reach their
objectives and may help in developing strategies to counter similar scenarios in multiple
network systems. Attack patterns can be built into scenarios, which take into account the
following:

•  Attack  objective  (base  data) 

•  Preconditions  set  to  motivate  the  attacker  to  achieve  the  objective 

•  Attack  mechanisms,  tools,  paths  chosen,  etc. 

•  Changes  to  system  state  as  an  attacker  proceeds  down  the  scenario  path  to 
achieve  the  objective 

This analysis leads into the second part of the game theory of the attack tree
analysis: identifying and analyzing the threat attributes leading an attacker to choose a
certain vulnerability path. This analysis supports the decision of where to allocate
resources to mitigate the risk from a given vulnerability.
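
The minimal cut sets discussed in this subsection can be enumerated mechanically from the gate structure. The sketch below is an illustration under stated assumptions: the tree is encoded as nested tuples (a representation chosen only for this example), leaf names paraphrase the Figure 15 discussion, and the function names are hypothetical. An OR gate contributes the cut sets of each child, an AND gate combines one cut set from every child, and any set that contains a smaller one is discarded.

```python
from itertools import product


def minimal(sets):
    """Drop duplicates and any set that strictly contains another set."""
    unique = set(sets)
    kept = [s for s in unique if not any(other < s for other in unique)]
    return sorted(kept, key=lambda s: (len(s), sorted(s)))


def cut_sets(node):
    """node is a leaf name (str) or a tuple (gate, [children])."""
    if isinstance(node, str):
        return [frozenset([node])]
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        combined = [s for sets in child_sets for s in sets]
    else:  # "AND": take one cut set from each child and merge them
        combined = [frozenset().union(*combo) for combo in product(*child_sets)]
    return minimal(combined)


tree = ("OR", [
    "access database directly",
    ("AND", ["use packet sniffer", "translate XML security addressing"]),
])

for cut in cut_sets(tree):
    print(sorted(cut))
```

Sorting by size puts single-step attacks first, which is usually where an analyst would look for the vulnerabilities needing the most attention.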

3.  Identifying  and  Analyzing  Threat  Motivations  and  Constraints 

Similar to the quantitative analysis of the fault tree, the probability of a threat
choosing a primary entry point on the attack tree, coupled with the likelihood of being
able to proceed to the next step, yields a value of risk with each step. For a given path
up the tree to the ultimate objective, the probability that a threat would reach the
ultimate objective is calculated in a manner similar to the composite failure rate
calculated from a fault tree. However, as shown in Figure 16, the quantitative values at
each node of the attack tree consist of an n-tuple of subjective values: the cost to the
attacker, the level of technical capability the attacker needs to reach that step, the
attacker’s perception of the likelihood of being detected before or during an attack, the
attacker’s perception of the value of reaching the ultimate goal, and other motivating
factors. If a reasonable quantification of the values associated with the attacker’s
several motivating factors can be agreed upon, and a reasonable limit on the number of
values can be reached, it might be possible to achieve a quantitative value for the
probability of each attack path for comparison and determination of where countermeasure
resources should be allocated. This calculated value is called the propensity for the
attack path by Amenaza Technologies (Ingoldsby, 2009).
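
The fault tree analogy can be made concrete with the usual gate arithmetic, assuming the steps are statistically independent: an AND gate multiplies the success probabilities of its steps, and an OR gate succeeds unless every alternative fails. The sketch below uses invented probabilities and hypothetical function names. It illustrates the composite calculation only and is not the propensity computation used by Amenaza Technologies.

```python
from math import prod


def and_gate(probabilities):
    """All steps must succeed (independence assumed)."""
    return prod(probabilities)


def or_gate(probabilities):
    """At least one alternative succeeds (independence assumed)."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Invented step probabilities for the two branches described for Figure 15.
p_direct_access = 0.1
p_cross_domain = and_gate([0.5, 0.4])  # packet sniffer AND XML translation
p_goal = or_gate([p_direct_access, p_cross_domain])

print(round(p_cross_domain, 3), round(p_goal, 3))
```

Because the inputs are subjective estimates, the result is better read as a ranking aid for comparing paths than as a measured probability.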


[Figure 16 node annotations: cost of attack to the threat; perceived probability of
getting detected; technical capability to get to this node; value of the ultimate asset;
other motivating factors]

Attack trees can be used in a game theory approach to evaluate an attacker’s
motivation. Enumeration allows comparison of the different paths to
determine the most likely route an attacker might take to get to the objective.

Figure 16. The Attack Tree from the Attacker’s Perspective


Carrying this methodology a step further, a network of arcs (paths) leading to
successive nodes (components in the system, similar to the nodes on the attack tree)
can be formulated and analyzed using various network analysis tools to determine the
shortest path to reach an objective and the changes in node states that would indicate the
presence of an intruder (Hamdi & Boudriga, 2005). In attack tree analysis, the
preconditions, postconditions, and steps are combined into each node on the attack
tree. In network methodology, conditions are separated from the steps between arcs and
nodes. Both analyses give more rigor to the threat assessment of the risk analysis process
and lend themselves to the development of better intrusion detection devices and other
network vulnerability containment features. They also allow the analyst to consider
scenarios involving multiple attackers on one system who may desire to conduct a
coordinated attack. Computer models have been developed that consider the coordinated
attack as the union of more than one individual attack scenario and seek to model the
system’s states as a coordinated attack occurs.
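
The shortest path idea can be sketched with Dijkstra's algorithm over a small graph of system components. The graph, the component names, and the arc weights (read here as attacker effort) are all invented for illustration, and the sketch is not the model of Hamdi and Boudriga.

```python
import heapq


def shortest_attack_path(graph, start, goal):
    """Return (total effort, path) for the least-effort route, using Dijkstra."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, effort in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + effort, neighbor, path + [neighbor]))
    return float("inf"), []  # goal not reachable from start


# Hypothetical components and attacker effort on each arc.
graph = {
    "internet": {"dmz web server": 2.0, "user workstation": 4.0},
    "dmz web server": {"application server": 3.0},
    "user workstation": {"application server": 1.5, "database": 6.0},
    "application server": {"database": 2.0},
}

print(shortest_attack_path(graph, "internet", "database"))
```

Raising the effort on an arc, for example by hardening the DMZ web server, and rerunning the search shows how a countermeasure shifts the attacker's preferred route.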

H.  SUMMARY 

Network enterprise risk analysis is an important part of an IA program, and it
should be implemented at the beginning of a network system’s lifecycle. The risk
analysis and management process follows the steps of general risk management processes
for safety, program, operational, and enterprise risks in organizations. Unique to network
risk management is the concentration on an unpredictable threat that is motivated to
exploit the network vulnerabilities it discovers, with the ultimate goal of gaining
something of value.

The  network  risk  management  process  identifies,  assesses  and  manages  risks  to 
the  network  by  implementing  mitigation  strategies  to  reduce  vulnerabilities  and  improve 
network  resiliency  through  recovery  and  survivability  techniques.  With  mitigation 
strategies  in  place,  the  process  then  requires  that  the  network  be  monitored  for  the  results 
of  the  mitigating  actions  to  observe  effects  on  network  functionality  and  the  network’s 
ability to resist or handle threats. Based on the effectiveness of the mitigating actions
as observed by monitoring, or on changes to network requirements or the environment
surrounding the network, the risk management process is taken through another iteration
to support improvements to the network’s adaptability based on experience.

Attack trees are a useful tool in analyzing network vulnerabilities and assessing
the motivating factors of threats to attack a network.

V. NETWORK SURVIVABILITY AND RESILIENCY


A.  INTRODUCTION 

This chapter examines what survivability and resiliency mean and why the
concept is important as part of a robust network enterprise risk management process. It
explores some of the ways to make a network more survivable and resilient. It also
discusses the basic framework of select system models developed by academicians for
quantifying network survivability, what the quantification of survivability can do to
support improving the risk management process, and the limitations of the models in
providing concrete evidence for quality decision making in a global networked system of
systems built on the scale of the GIG and in systems based on Service Oriented
Architecture.

While risk analysis and management are designed to find and fix vulnerabilities
that put the network at risk from a threat that exploits them with the intent to gain
access to valuable information system assets, survivability is the attribute of a system
that defines how it deals with an actual exploitation of network vulnerabilities that have
remained after mitigation has been implemented. It is the architecting of a system, before
attack, to respond to an attack that occurs after other risk mitigation plans have been
implemented to resist attack and have reduced network vulnerability. In other words,
survivability and resiliency provide defense in depth for a network by designing in the
capability to continue to act, to resist, and to recover after an attack scenario, not
just the mitigation strategies put in place before attack. While it is vitally important
to manage the risk to a network before attack and to make every effort to keep it from
happening, a further defense mechanism and process needs to be in place in the event of an
attack.

For many of the critical networks upon which military forces rely to obtain the
information they need, it is essential that the critical networked communications and
applications be designed to survive because, by nature, they are exposed to a hostile
environment. It is highly probable that an adversary is attempting to disrupt the military’s
computer networks and their networked operations to gain the edge in any level of
operation, from high intensity conflict to stabilization and peace-keeping operations. If a
network can be designed or modified to be more survivable, it can strengthen its defenses
against those risks that it was unable to mitigate or remove fully prior to attack.
This  adds  flexibility  to  the  management  of  risk  up  front,  allowing  for  greater  options 
when  deciding  how  to  deal  with  the  risks  identified  at  the  beginning  of  the  risk 
management  process.  This  is  especially  important  when  managing  the  risk  on  critical 
network  systems  since  any  intrusion,  no  matter  how  small  the  probability,  could  put 
important operations in severe jeopardy. Not only is it desirable to architect a network
to be less vulnerable to failure (whether from deliberate or accidental intrusion or from
hardware/software malfunction), but also to design the network to be resilient and adaptable
under attack and failure, and to be able to recover in time to minimize the disruption to
the completion of critical tasks. The ultimate goal in architecting a network for
survivability and resiliency is to improve its operational effectiveness, safety, and
affordability through the refinement of technical effectiveness, system effectiveness,
system availability, and cost controls (Ellison, Fisher, Linger, Lipson, Longstaff, & Mead,
1999).

B. DEFINING SURVIVABILITY AND RESILIENCE

1. Network Survivability Characteristics

a.  Susceptibility 

Susceptibility  is  the  capability  (or  lack  thereof)  to  avoid  an  attack.  It  is 
essentially  the  converse  of  looking  at  the  threat  agent.  Chapter  IV  discussed  using  a  game 
theory  approach  with  the  attack  tree  method  for  analyzing  a  threat’s  motivation  and 
capability  to  exploit  a  vulnerability  and  conversely  determining  which  paths  through  the 
tree  are  most  vulnerable  so  the  vulnerability  can  be  mitigated.  Susceptibility  approaches 
the  game  theory  from  the  standpoint  of  analyzing  for  the  network’s  strengths  and  tactics; 
particularly  the  network’s  ability  to  recognize  an  attack  and  avoid  it.  In  any  system, 
susceptibility  can  be  reduced  by  using  decoys/deception,  removing  the  system  from  the 
adversary’s  theater  (or  not  letting  network  boundaries  interact  with  the  adversary’s 
system  boundary),  and  by  the  capability  to  recognize  an  attack  on  the  system  (intrusion 
detection) in time to take evasive or defensive action. In a computer network, this can be
accomplished by the correct design and operation of intrusion detection devices, policies
as to when and where networks are accessed, and tactics as innocuous as the “honey pot,”
which draws an adversary to a place on the network that looks appealing but is a trap to
keep the threat from reaching valuable data, applications, or network functionality
(Ellison & Moore, 2001).


b.  Vulnerability 

In survivability and resiliency analysis, susceptibility and vulnerability go
hand in hand. When looking at a network’s vulnerability from the survivability
perspective, the system weaknesses and the paths an adversary might take to exploit the
system to achieve an objective are not the only factors. The interrelationships between
known areas of exploitation must also be analyzed for the effects of that exploitation:
what the immediate damage to the system would be, and whether the effects cause
a progression of damage to the network in a linear or cascading (arithmetic,
multiplicative, or exponential) manner. Survivability analysis seeks to define where
damage could occur on the network, what the extent of the damage is and how that
changes over time, and what methods or technology might be employed to limit or stop
damage to the network once an adversary has discovered and begun to exploit that
vulnerability (Ellison & Moore, 2001).
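
The difference between a linear and a cascading progression of damage can be
illustrated with a small sketch. The Python below is not taken from the cited sources. It
assumes a hypothetical dependency map (`dependents`) in which damage to one node
spreads, one time step later, to every node that depends on it, and it reports the
cumulative number of damaged nodes at each step. The node names and topologies are
invented for illustration only.

```python
from collections import deque


def cascade_steps(dependents, start):
    """Return {node: step} giving the step at which each node is first damaged.

    dependents maps a node to the nodes that rely on it (an assumed model).
    """
    damaged = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in dependents.get(node, []):
            if nxt not in damaged:
                damaged[nxt] = damaged[node] + 1
                queue.append(nxt)
    return damaged


def damage_over_time(dependents, start):
    """Cumulative count of damaged nodes at step 0, 1, 2, ..."""
    steps = cascade_steps(dependents, start)
    horizon = max(steps.values())
    return [sum(1 for s in steps.values() if s <= t) for t in range(horizon + 1)]


# Hypothetical topologies: a chain (linear growth) and a fan-out (cascading growth).
linear = {"a": ["b"], "b": ["c"], "c": ["d"]}
fanout = {"a": ["b", "c"], "b": ["d", "e"], "c": ["f", "g"]}

print(damage_over_time(linear, "a"))  # [1, 2, 3, 4]
print(damage_over_time(fanout, "a"))  # [1, 3, 7]
```

In the chain, damage grows by one node per step; in the fan-out, it roughly doubles,
which is the kind of progression a survivability analysis would want to interrupt early.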

c.  Recoverability 

Analysis and design for recoverability of the networked system parallels
reactive risk management processes and methods described in Appendix C.
Recoverability is the quality of robustness displayed by a system that can return to
normal operation after an attack by a threat agent. It is defined by how
the system and the operators respond to an attack. However, the recoverability of a
network must be designed into the system both technologically and procedurally. In
critical networked computer systems, continued operation in the face of an attack may
require instant response and can only be controlled by an automatic response mechanism
such as a switchover or an auctioneered system that can assume the critical capability. If
time is critical but not immediate, system design should consider the intermediate stages
that can be recovered in the face of an attack and still maintain critical functionality.
Also, consideration must be given to the network’s capability to provide critical services
at a reduced capability for an extended period of time, and to how much and how long
degradation to the operational objective is permissible until full system recovery or until
the objective is not attainable even with later recovery. An important aspect of
recoverability is the ability of a network to adapt in the face of an attack and to learn from
that adaptation to make the network more resistant to future attacks. However, as the
system becomes more resilient to attacks through experience, the threat agent can, and
does, change tactics. The ultimate capability of a recoverable, adaptable system is to learn
from the present attack and be able to predict how the next attack morphs so that the
system is less susceptible to the next attack. How to design perspicacity into a global
networked information system can be somewhat problematic (Ellison & Moore, 2001).
In a networked system, it is important to design in redundancy and dispersion/distribution
(through enclaves or other separation schemes). When the attack has started, the system
must be able to redirect the energy of the attack (for example, a denial of service
redirected to an inactive server), isolate faults and shed load, compensate for services or
capabilities that are damaged or stolen, and recombine system components
automatically to reconstitute critical services. On top
of the automatic adaptations, procedures and operator instructions that are pre-planned
and practiced to reconfigure network connections and even architecture are necessary for
continued uninterrupted operation and continuation of critical operations.
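
Load shedding under reduced capability can be sketched as a simple priority rule. The
Python below is an illustrative assumption, not a mechanism described in the cited
sources: each service has a hypothetical priority (1 is most critical) and a load, and
when an attack reduces available capacity, services are admitted in priority order
until the capacity is used up. The service names and numbers are invented.

```python
def shed_load(services, capacity):
    """Split services into (kept, shed) given the remaining capacity.

    services: list of (name, priority, load); a lower priority number is
    more critical. Services are considered in priority order and kept only
    if their load still fits in what remains (a greedy rule, assumed here).
    """
    kept, shed = [], []
    remaining = capacity
    for name, _priority, load in sorted(services, key=lambda s: s[1]):
        if load <= remaining:
            kept.append(name)
            remaining -= load
        else:
            shed.append(name)
    return kept, shed


# Hypothetical services: (name, priority, load in arbitrary capacity units).
services = [
    ("email", 3, 30),
    ("targeting", 1, 40),
    ("sensor_feed", 1, 30),
    ("web_portal", 4, 20),
    ("logistics", 2, 25),
]

kept, shed = shed_load(services, capacity=100)
print(kept)  # ['targeting', 'sensor_feed', 'logistics']
print(shed)  # ['email', 'web_portal']
```

A greedy rule like this is only one possible policy; a real design would also have to
decide how long each shed service may stay offline before the mission objective is lost.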

2. Designing a Network for Survivability

a.  Designing  Survivable  Networks  at  the  System  Boundaries 

Designing a network for survivability is driven by the scenario of
an attack by a given threat to the network. The architectural decisions are dependent on
what the attack might be and the probability of the attack. However, the architecture is
also guided by the requirements definitions. The challenge in designing survivable
systems is to determine how to define the requirements of a survivable and resilient
system since survivability depends on the type, timing, and depth of attack. During the
requirements definition phase of a system’s lifecycle, requirements are defined and given
specifications based on the Concept of Operations (CONOPS), which is used to
determine the system’s capabilities. The CONOPS is also scenario driven, but with the
scenario desired to meet the organization’s goals derived by effects-based thinking (the
effects the organization desires to achieve from the operation of the system, among other
actions). However, the CONOPS scenario is somewhat deterministic whereas the threat
scenarios are stochastic.

One location in a network (or any system for that matter) where the
probability of attacks from threats can be somewhat constrained is at the system
boundaries. Much the same as in movement warfare, designing a network for
survivability and resiliency depends on where the line is held against attack, or what
system boundary is the last stronghold to ensure the network delivers required critical
services in the face of attack. As seen in the attack tree analysis, the attacker enters into
the  network  by  exercising  sequences  of  interdependent  decisions  to  produce  (the  attacker 
hopes)  undetected  and  disastrous  consequences  to  the  functionality  and  information  of 
the  network.  McCabe  Software®  uses  a  software  validation  approach  to  check  for 
trustworthiness  of  software  paths  and  to  uncover  security  flaws  in  software  code  by 
analyzing  the  control  flow  paths  and  verifying  control  flow  integrity.  This  method  is  a 
way  of  drawing  the  line  (at  the  network  boundary  as  defined  in  the  requirements)  on  an 
adversary’s  attack  (McCabe,  2009). 

b.  Designing  Survivable  Systems  with  COTS  Software 

Designing networks for survivability and resiliency with COTS software
as prescribed by open architecture requirements can be problematic because of the lack of
access to the COTS software artifacts or the COTS engineering process when COTS products
are used in network system development. Controlling system states is important throughout
the software development cycle, and access to the artifacts and
engineering process of COTS is important in providing the assurance evidence for the
trustworthiness of the developed software. Given the realization that software is becoming
the overall cross-functional systems integration agent, one way to provide this assurance
is through the methodology of the Capability Maturity Model Integration (CMMI) levels
of software development (Chittister & Haimes, 2006). Another way is through the
methodology of a Vendor Risk Assessment and Threat Evaluation Project offered by the
Carnegie-Mellon Software Engineering Institute CERT. This methodology examines the
COTS product used in software development from the vendor’s inherent risk elements
(visibility of artifact attributes, vendor performance history, trustworthiness, vendor
technical competence and compliance to standards) and the vendor’s risk associated with
the developer’s risk management skills in dealing with vendors (technical and
non-technical risk mitigating factors, exposure, vendor compatibility, independence and
interdependencies) (Ellison, Linger, Lipson, Mead, & Moore, 2009). The human
knowledge role in software development should not be overlooked when designing for
resiliency.

The  focal  point  of  systems  integration  is  the  realization  that  all  hardware- 
software  systems  are  made  of  multiple  interdependent  sub-systems.  Each 
sub-system,  in  turn,  is  a  system  that  is  driven  by  its  own  state  variables, 
inputs,  outputs,  and  control  and  random  variables,  among  others,  where 
the  output  of  one  system  constitutes  the  input  to  others.  Understanding  this 
interconnectedness and the interdependencies among these many sub-systems
is imperative for an effective software architectural design and for
ultimate  systems  integration  and  control.  (Chittister  &  Haimes,  2006,  p.  9) 

c.  An  Example  of  Survivability  and  Resiliency  Scenario-Driven 
Requirements 

If survivability and resiliency requirements can be defined at the
network’s boundaries, and the important (most probable or most
devastating) threat scenarios can be analyzed for control flow paths using attack tree
analysis, then this data can be used in a survivability/resiliency analysis to study the
effectiveness of the network Intrusion Detection Systems (IDS) and reaction/recovery
mechanisms in meeting minimum standards for connectivity, time and amount of lost data
(as in packet loss in an IP network), and mechanisms and software processes to establish
and safeguard data in priority order. The survivability/resiliency studies support the
determination of which survivability primitives of redundancy, replication, distribution,
separation, access control, diversity, or adaptive reconfiguration contribute most to
making a survivable, resilient network. Consider the following example of possible
survivability requirements driven by a threat scenario.

It  is  desired  to  interconnect  in  an  information  systems  network  a  sensor 
system,  an  intelligence  analysis  system  and  a  weapons  control  system  to 
detect,  track  and  engage  a  theater  ballistic  missile  attack.  To  accomplish 
the  objective,  the  weapons  control  system  must  receive  information  from 
the  sensor  network,  validation  of  the  target  from  the  intelligence  analysis 
network,  and  verification  of  weapons  parameters  from  the  weapons 
control  network  to  engage  and  destroy  the  target.  Risk  analysis  determines 
that  the  highest  probability  threat  and  the  weakest  vulnerability  to  the 
networked  system  of  systems  is  the  alteration  of  missile  identification 
parameters  in  the  intelligence  network.  Without  valid  missile 
identification,  the  weapons  control  system  does  not  allow  engagement  of 
the  target  missile.  From  the  attack  tree  analysis,  a  path  of  a  threat  agent 
leading  to  missile  identification  data  is  mapped.  Survivability  analysis 
looks at the probability that the intrusion can be detected as the threat
nears the goal of altering the intelligence network, how the damage to the
system progresses over time, and what level of damage can be tolerated before
the correct information has to reach the weapons control network to
achieve successful target engagement. Resiliency analysis examines how
quickly  the  intelligence  network  can  stop  the  intrusion,  determine  what 
data  was  corrupted,  what  data  is  trustworthy,  and  how  soon  the  network 
can  return  to  normal  or  how  quickly  it  can  switch  to  an  alternate  network 
path  unaffected  by  the  intrusion. 

While  the  quantification  (such  as  it  is)  of  the  impact  of  a  network 
compromise  during  the  survivability/resiliency  analysis  is  similar  to  the  impact  variable 
in  preventative  risk  analysis,  the  probability  has  shifted  to  the  network’s  ability  to 
recognize,  react,  recover  and  adapt  as  opposed  to  the  probability  that  the  intrusion 
happens  in  the  first  place.  As  part  of  the  design  of  the  survivable  system,  the  goal  is  to 
determine  the  probabilities  attached  to  recognition,  reaction,  and  recovery  for  a  given 
intrusion  and  to  build  a  system  that  recovers  or  adapts  in  enough  time  to  meet  the 
system’s mission. Even when the important threat agents have been considered, their
interactions at the boundaries of the network have been studied, and the possible ways
they could penetrate the network and cause damage have been derived from attack tree
analysis using a type of game theory, aggregating, even from a holistic perspective, the
scenarios from which a threat might come is no small matter. Moreover, with the changing
dynamic of a network’s objectives (especially an ad hoc network in a wireless or satellite
communications mode), designing resiliency into the network to secure its assets and
functionality requires the design to be adaptable to accommodate the changing objectives.
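
The shift in what the probability measures can be made concrete with a small worked
sketch. The Python below rests on assumptions that are not part of the analysis above:
recognition and reaction are treated as independent events with fixed probabilities,
and recovery time is modeled as exponentially distributed with a given mean, so that
the chance of recovering within the time available is 1 - exp(-t / mean). All
parameter names and values are hypothetical.

```python
import math


def p_recover_in_time(mean_recovery_s, time_available_s):
    """P(recovery time <= time available), assuming exponential recovery time."""
    if mean_recovery_s <= 0:
        raise ValueError("mean_recovery_s must be positive")
    return 1.0 - math.exp(-time_available_s / mean_recovery_s)


def p_mission_survives(p_recognize, p_react, mean_recovery_s, time_available_s):
    """Probability that an intrusion is recognized, reacted to, and recovered
    from before the mission deadline (independence is assumed)."""
    return p_recognize * p_react * p_recover_in_time(mean_recovery_s, time_available_s)


# Hypothetical comparison: the same detection capability with slower and faster recovery.
slow = p_mission_survives(0.9, 0.95, mean_recovery_s=60.0, time_available_s=30.0)
fast = p_mission_survives(0.9, 0.95, mean_recovery_s=10.0, time_available_s=30.0)
print(f"slow recovery: {slow:.3f}")
print(f"fast recovery: {fast:.3f}")
```

Even under these simplifying assumptions, the sketch shows why recovery time relative
to the mission deadline matters as much as the probability of detecting the intrusion.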

d.  Challenges  in  Quantifying  Survivable  Network  and  Software 
Attributes 

Quantifying survivable network attributes requires the determination of
what attributes of a network make it more survivable and how they are quantified:
attributes such as connectivity ratio, quantity and quality of information transmitted and
received, service request distortion percentage, maximum service disruption time, and
node (server) state. To improve the survivability of a network, design trade-offs need to
be considered between network functionality under normal conditions and system
requirements to meet mission objectives under selected attack scenarios, along with which
of the network’s attributes cited above are most important. Survivability analysis of
computer networks and the network software controlling operations has peculiarities that
differ from survivability analysis of traditional hardware systems. Software, the largest
component of networks and arguably the most vulnerable, can be attacked to varying
degrees by almost anyone with some knowledge of computer systems and programming.
It is not susceptible to obsolescence, and it is particularly difficult to uncover all its faults
(especially in COTS as noted above) and the intrusions that might introduce additional
faults. The faults are not randomly distributed across a class of software as with
component reliability and survivability of hardware components of the same type or
makeup, so software faults can defy accurate prediction. While some of the same
design primitives that make a physical system more survivable hold true for
software systems and computer networks, such as geographical separation, redundancy of
units, deception techniques, and human access control, software and network systems
offer unique characteristics that make some design primitives indispensable when
dealing with information generation and information flow. Duplication and rapid
replication of data and recovery of connections in networks is usually not possible in
physical systems, but with the careful design of network control and management
software, these primitives can be designed in to improve network resiliency. Also
especially important is the ability of a network to reconfigure adaptively in fractions of a
second if necessary.
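
One of the attributes named above, connectivity ratio, can be illustrated with a short
sketch. The attribute is not defined formally here, so the Python below assumes one
plausible definition: the fraction of all node pairs in the original network that can
still reach each other after a set of nodes has failed. The topologies and node names
are hypothetical.

```python
def components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u in adj and v in adj:  # ignore links that touch a failed node
            adj[u].add(v)
            adj[v].add(u)
    seen, comps = set(), []
    for n in nodes:
        if n in seen:
            continue
        stack, comp = [n], set()
        while stack:
            x = stack.pop()
            if x in comp:
                continue
            comp.add(x)
            stack.extend(adj[x] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def connectivity_ratio(nodes, edges, failed=()):
    """Connected node pairs after failures divided by all pairs in the original network."""
    failed = set(failed)
    alive = [n for n in nodes if n not in failed]
    total_pairs = len(nodes) * (len(nodes) - 1) // 2
    if total_pairs == 0:
        return 1.0
    connected = sum(len(c) * (len(c) - 1) // 2 for c in components(alive, edges))
    return connected / total_pairs


# Hypothetical five-node networks: a star around hub "h" and a ring.
star_nodes = ["h", "a", "b", "c", "d"]
star_edges = [("h", "a"), ("h", "b"), ("h", "c"), ("h", "d")]
ring_nodes = ["n1", "n2", "n3", "n4", "n5"]
ring_edges = [("n1", "n2"), ("n2", "n3"), ("n3", "n4"), ("n4", "n5"), ("n5", "n1")]

print(connectivity_ratio(star_nodes, star_edges, failed=["h"]))   # 0.0
print(connectivity_ratio(ring_nodes, ring_edges, failed=["n1"]))  # 0.6
```

Under this definition, losing the hub of the star disconnects every remaining pair,
while the ring keeps its four surviving nodes connected, which shows how redundancy
and distribution show up in a measurable attribute.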

C.  ARCHITECTING  SURVIVABLE  NETWORKS 

1. Network Attribute Considerations

Architecting  survivability  into  large,  distributed  computer  networks  draws  on  the 
disciplines  common  to  the  protection  of  critical  systems  such  as  security,  safety, 
reliability, and fault tolerance (Ellison et al., 1999). The goal of designing survivability
into  networks  is  to  preserve  essential  services  that  allow  mission  completion  during 
network  intrusion  and  compromise.  Architecting  the  system  to  be  survivable  is  a 
balancing  act  between  the  network’s  design  to  support  the  maximum  functional 
capability  under  normal  operations  and  the  ability  to  defend  against  and  recover  from  any 
compromise  to  the  system  while  maintaining  critical  functions  during  attack  and 
recovery.  The  success  of  system  recovery  is  measured  by  the  system’s  ability  to  minimize 
the  consequences  of  degradation  to  system  functionality  and  to  the  network’s  critical 
attributes  by  preserving  data,  and  allowing  the  continued  execution  of  essential  services 
during an attack on the system. Critical attributes of a distributed network are the
capability  to  access  data  and  services  and  to  share  between  network  nodes,  and  to 
conduct  the  necessary  computations  and  data  manipulations  to  achieve  the  system’s 
objectives. The challenge in architecting a survivable network is that during normal
operation, it is desirable to minimize the constraints that system defenses place on
system operation and availability, but it is also desirable to have the
necessary protections in place (procedures and countermeasures) to resist, recognize, and
recover from system degradation or failure, whether the compromise is due
to system fault, unintentional accidents, or intentional attacks by intruders. The more
survivable the network is, the greater its tolerance to threats and the lower the risk to
operations.

a. Consequences

In architecting a network for survivability, the consequences to system
operation and mission fulfillment that various threats could cause, should
vulnerabilities be exploited, must be considered. This approach is slightly different than
the classic models for risk analysis and assessment. While it is necessary to seek
information about all known possible threats, where they might originate, and the threats’
capabilities, motivations, and timing to cause the network harm, keeping up with the
multitude of constantly evolving threats, known and unknown, in the environment around
the boundaries of a system can be an insurmountable task. New threats and threat
tactics/methods emerge every day from global sources of unknown entities who would
like to intrude on U.S. systems for various reasons. Since the universal set of threat
agents potentially staged to cause system harm cannot be fully known, what is known and
understood is used to examine the network from the survivability (or consequences) angle.
To do this requires looking at the network from an intruder’s standpoint and examining how
the network developer needs to architect the network to reduce susceptibility. In the areas
where the network is susceptible, ways to mitigate system vulnerability are examined,
followed by the mechanisms and system characteristics needed to handle the
consequences of a given attack and preserve critical attributes in the face of an attack.

b.  Connectivity 

Analysis of architecting for survivability is complicated by today’s
network systems, which because of their size, are unbounded (or an unbounded,
networked system of systems of individual bounded systems). This is especially true of
DoD’s GIG from a top-level view. By its nature and global size, it is an unbounded
system, much like the Internet (to which it connects). A network is unbounded in the
sense that no single participant or group of participants is certain of who is part of the
entire network. In an unbounded network, no centralized control exists, and participants
have to trust the other members on the network to comply with agreed upon standards. Both
legitimate users and threat agents act as peers on the net (Maier & Rechtin, 2002; Ellison
et al., 1999). Without boundaries, a network’s environment becomes part of the
system. Threat agents come from the environment. Thus, they are intrinsically part of an
unbounded system and can be thought of as latent participants (until they attack, of
course).


c.  Control 

Another  complication  to  large,  highly  distributed  networks  is  the 
constraint  on  network  control  and  governance.  When  networks  were  small  and  bounded, 
their  architectures  were  built  around  centralized  governance  that  had  a  manageable  span 
of  control.  One  controlling  entity  had  the  power  to  enforce  policy  and  enact  sanctions  for 
inappropriate  network  conduct.  Network  protection  was  implemented  with  static 
countermeasures,  the  most  ubiquitous  being  firewall  installations;  first,  at  individual 
workstations,  then  in  front  of  local  area  networks,  and  later  enterprise  firewalls  were 
incorporated  in  the  network  architecture  and  integrated  with  the  desktop  firewalls. 
Changes  and  updates  to  firewall  configurations  were  pushed  to  users  by  local  system 
administrators  or  by  a  centralized  network  administrator  (NSA/SNAC/IAD,  2006). 
However, this architectural arrangement is insufficient to protect systems as networks
take on global proportions, with governance decentralized or non-existent, and the
network characterized by interoperable and collaborative systems of systems. Since
control  of  distributed  networks  is  so  tenuous,  there  is  good  reason  to  consider  network 
risk  management  from  a  survivability  viewpoint.  With  an  inability  to  impose  controls 
throughout  the  entirety  of  the  network,  survivability  architecture  looks  at  consequence 
management,  focusing  on  the  mission  accomplishment  of  network  segments  and  on  the 
creation  of  the  ad  hoc  networks  envisioned  for  the  target  GIG. 

d.  Governance 

As  the  size  and  complexity  of  the  network  grows,  operating  standards  and 
protection  methods  become  schemas  agreed  upon  by  the  membership  of  the  network. 
When  an  organization  makes  the  decision  to  move  to  service  oriented  architecture,  the 
system  takes  on  more  characteristics  of  an  unbounded  network.  A  sophisticated 
architecture  of  countermeasures,  authentication  devices  and  procedures,  and  network 
behavior  recognition  and  intrusion  systems  must  be  introduced,  tested  and  monitored  by 
someone who has the responsibility for accomplishing the mission and ensuring the
availability of services. Partitioned segments of the network can be loosely controlled (as
is done with the Internet in countries desiring some political control), but even that
governance and control is limited, and attempts to subvert it occur constantly. The U.S.
military’s GIG has a better chance of providing some governance over its participants
and its operations because, on a high order scale, the objectives of the participants on the
grid are focused around a common cause, and a form of constructive governance
might be achieved. However, as the network users’ objectives become more refined into
distributed and parochial operational plans and tactics, this form of network control
becomes more diluted, governance of the net tends toward the problematic, and
control is less centralized. This fact soon becomes more apparent as the DoD lashes the
individual services’ network architectures together to create the global target GIG. The
overall joint governance of the net will most likely be based more on a set of collaborative
standards than on a defined rule set controlled by a central authority such as DISA.

e.  Communication 

In a distributed network architecture with services spread across many
domains and an environment of diminished trust with no unified system administrative
control, survivable network design relies on common communications and routing
systems to tie distributed services, diversity in coding and protocol mark-up, and node
logic systems together in a survivable package. Since it is impossible to know everything
about a threat agent, the survivable network requires an architecture built on the
interactions between nodes such that protection is protocol-based instead of architected
for a given network topology. The network must have some system of trust maintenance,
and must have system-wide properties that do not reside strictly within nodes and are
emergent and stochastic (Ellison et al., 1999). A functional decomposition of the
network’s attributes sets the priority for where resources are allocated to protect and to
recover those network attributes that must be maintained in the face of an attack, or must
be instantly recovered because their function is time-critical and any interruption would
cause a failure of the mission.

The capability to maintain essential services (and maintain the associated
essential properties) must be sustained even if a significant portion of the
system is incapacitated. Furthermore, this capability should not be
dependent on the survival of a specific information resource, computation,
or communications link. (Ellison et al., 1999, p. 9)

2. Elements in the Architecture of a Survivable Network

a.  Usage  Models 

To determine the make-up of a survivable network, the developer
constructs a system usage model for the network, both from the standpoint of a legitimate
user and from that of an intruder (since it was stated that they act as peers on the net).
The usage model shows what services are essential and non-essential, and the timing, load,
and all possible uses of a network service (Ellison & Moore, 2001). For survivability, the
architect must determine what survivability services are employed and their allocation.
Survivability services and their components consider a threat agent’s capability to access
the network, to penetrate nodes of the network and systems on nodes, its ability to
navigate within nodes to discover information, and its ability to exploit or corrupt
services in the node. Survivability services are as follows.

•  Resistance  to  intrusion  through  the  use  of  firewalls,  diversity  of  programs, 
and  encryption  and  authentication 

• Recognition through intrusion detection devices, anomaly and behavior
pattern recognition, trust maintenance, and self-awareness methods

• Adaptability through backup programs, alternate connections, scalable
bandwidths, learning from attacks, and sharing of fixes between nodes

•  Recovery  through  redundancy  and  data/program  replication,  fault  tolerant 
mechanisms,  diversity  in  data  storage 

Since it is nearly impossible to tell the difference between users on a
network, the only way to determine whether a user is legitimate or hostile is to observe
the user’s behavior with respect to the system. The Computer Science Department at
Cornell University, among others, has studied which differences in client behavior on a
server provide clues to illegitimate use of a network. They have adapted the timeout
feature of computer network fault detection to transactions between objects, using the
timing of those transactions to determine whether a fault is real or just a transitory state
of one of the clients on the network (e.g., entering or exiting the process during a
transaction) (Birman, 2009). By virtue of its reliance on the service-oriented architecture,
the military’s GIG network is unbounded. If a network partition is to be shut down
because a fault has been detected, the architect must identify the consequences of the
shutdown, the critical time of shutdown, how data in a transactional state are to be
preserved, and what obligations to the user or to the mission objective are forfeited by the
shutdown. It is most critical to recover the system before determining the cause, because
the cause often becomes apparent only during recovery and not while the system is in a
shutdown state.
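
The timeout-based distinction between a transitory state and a real fault can be
illustrated with a minimal sketch. It is not the Cornell implementation; the class name
`TimeoutDetector`, the `timeout` and `max_misses` parameters, and the three status
labels are assumptions introduced here for illustration. A client that misses a single
deadline is only suspected, and is declared failed only after several consecutive misses.

```python
from dataclasses import dataclass, field


@dataclass
class TimeoutDetector:
    """Hypothetical detector: classify clients from transaction timing alone."""
    timeout: float = 1.0      # seconds allowed per transaction (assumed value)
    max_misses: int = 3       # consecutive misses before declaring a fault
    misses: dict = field(default_factory=dict)

    def observe(self, client: str, elapsed: float) -> str:
        """Record one transaction time and return the client's status."""
        if elapsed <= self.timeout:
            self.misses[client] = 0          # a timely reply clears suspicion
            return "healthy"
        self.misses[client] = self.misses.get(client, 0) + 1
        if self.misses[client] >= self.max_misses:
            return "failed"                  # persistent: treat as a real fault
        return "suspected"                   # possibly a transitory state


detector = TimeoutDetector(timeout=1.0, max_misses=3)
history = [detector.observe("node-a", t) for t in (0.2, 1.5, 0.3, 1.4, 2.0, 1.8)]
```

In this sketch the isolated slow reply (1.5 s) raises suspicion that the next timely reply
clears, while three consecutive slow replies end in a "failed" status. Choosing
`max_misses` trades detection speed against the risk of shutting down a partition over a
transient condition.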

Architecting for a survivable network requires an architecture that allows
no single point of failure, a system of continuous trust verification between nodes of the
network, protocols that define knowledge between nodes, and an allocation of services
such that the failure of any single node does not significantly detract from the network’s
overall mission (Ellison et al., 1999). Work continues on determining where in the
network layer the encoded trust information resident in the network’s communication
protocol can be interpreted, and on the cost differentials of packaging server state
information into protocols to verify trust and the availability of critical processes. Much
research, described below, has been done at Cornell University on recognizing errant
network behavior, on how network processes are ordered, and on how the network reacts
when one process fails.


b.  Fault  Tolerance 

One of the important aspects of recoverability in a survivable network is
fault tolerance. This is the ability of the system to withstand failure and keep critical
elements functioning, mainly through the mechanism of redundancy. If the goal is to
lower the risk of losing a critical capability, designing in fault tolerance supports the
system’s capability to withstand attack and damage to the network and allows continued
access to critical services. Fault tolerance improves system reliability, availability,
dependability, and safety. Redundancy in network components not only protects the
system from intentional attacks, it also improves reliability in the face of unintentional
accidents and equipment failure. The key to fault tolerance is the ability to architect in
reliability because “... in all practical systems, reliability is of great importance, and any
system that is unreliable is likely to be unsuccessful even if it is safe” (Storey, 1996,
p. 114). Here again the emphasis is on architecting a system for use and availability by
increasing reliability, which is one of the main drivers of availability, and on balancing
system and personnel protection with system utilization.

Architecting fault tolerance requires understanding the nature, duration,
and extent of the expected fault. The nature of a fault can be random or systemic.
Random failures are generally associated with hardware. Software most often exhibits
systemic failures because software cannot become obsolete or degrade like a physical
component; its failures originate in its specification, coding, logic, or variables. The fault
duration can be permanent, intermittent, or transient. Many software faults exhibit
intermittent failures, and it can be difficult to find the cause. The extent of a fault can be
localized or system-wide.

To deal with faults and design fault tolerance, system models can be
constructed to analyze failure modes and assess the effect of faults on the system given
their nature, duration, and extent. Hardware (especially computer memory and
processing chips) is usually modeled with the “single stuck-at,” “bridging,” or “stuck-
open” fault model and analyzed with the Failure Modes, Effects, and Criticality Analysis
(FMECA) method. These methods and models are important to apply at the physical
layer during the architecting of a networked system. As discussed in the section on risk
methodologies, computer system and network software faults are analyzed using fault
tree analysis and its transposition, attack tree analysis. System software faults are
difficult to detect because the number of test vectors for a software process can be
overwhelming. Software is often assumed to have some faults that are never detected and
are tolerated. Given this assumption, it is important that the system be architected for
fault tolerance during design. It is far easier to discover and remove software faults
during the development stage of the system life cycle than during operation. However,
this method of fault coverage can become problematic under an open architecture
scheme that uses COTS components. The purpose of using COTS in system development
is to take advantage of the testing already conducted by the commercial vendor, and to
use the vendor’s usage data during system operation for reliability determinations. Often,
however, COTS software contains either proprietary programs or so many lines of code
that it is prohibitive to double-check the reliability of the program, or to ascertain whether
any malicious code is embedded in the software (Storey, 1996; Anderson &
Hundley, 1998).

Open architecture is one good reason to build in fault tolerance. The other
is to mitigate the effects of an attack on the network, and to support system recovery by
architecting the system to be survivable. To build in fault tolerance, the system is
designed with redundancy. However, the redundant design has to be smart. Limited by
resources, bandwidth, power, space, or the environment in which the system operates, a
system presumably cannot be made with an infinite amount of redundancies such that it
contains an infinite set of duplicate components with which to switch over. Additionally,
for systemic failures, a redundant component of a component with a systemic failure also
fails the same way and is redundant in the true sense of the word; not necessary. For
systemic failures, the redundancy needs to be diverse so as not to duplicate the systemic
error in the primary component. The architecture of the redundancy is also determined by
whether the system is composed of hardware, software, information, or time.

• Hardware is classically made fault tolerant by triple modular redundancy;
three components that perform the same function and are switched on or
off by a fault sensor, or have their outputs auctioneered to select the
component with the valid output.

• Software programs can be duplicated to produce the same output but by a
different process to eliminate system errors in the program. For an attack
on the network, it is necessary to ensure the backup program is in a
different location and cannot be corrupted by the same threat agent that
corrupts the primary program.

• Information is made fault tolerant by duplicate repositories of data,
stacking information on top of the data to ensure its validity such as check
sums or indexing schemes such as hash functions and tables.

• Timing functions for program execution can be used to ensure a fault
tolerance from intermittent failures such that the program or service is
timed to execute when the process fault is recovered. There are many
timing schemes for program execution operating or under development
such as cloud computing and multicasting programs (Birman, 2009).
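
The hardware case in the list above can be made concrete with a short sketch of triple
modular redundancy. Selecting the output by majority vote is an assumption made here
(the text also allows switching by a fault sensor), as are the function names, the
independence of module failures, and a perfectly reliable voter. Under those
assumptions, the probability that at least two of three modules with reliability r work is
3r^2 - 2r^3.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of modules, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def tmr_reliability(r: float) -> float:
    """Probability that at least 2 of 3 independent modules work."""
    return 3 * r**2 - 2 * r**3


voted = majority_vote([42, 42, 17])     # one faulty module is outvoted
no_quorum = majority_vote([1, 2, 3])    # no majority: the fault cannot be masked
triple = tmr_reliability(0.9)           # compare with 0.9 for a single module
```

With r = 0.9 the triplicated arrangement is more reliable than a single module, but the
same formula shows that triplication makes matters worse when r falls below 0.5, which
is one reason the redundant design "has to be smart."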

Once the architecture of the redundancy has been determined, it is
necessary to architect the system for fault detection. For network hardware, the simplest
arrangement is called masking, which does not actually detect a fault but simply allows a
redundant system to pick up the load. Dynamic systems sense faults by examining output
and comparing it to a desired output. Upon fault detection, the system switches to a
redundant component to contain the existing fault and reconfigures the system for
continued fault-free operation. There are many arrangements to architect the system for
fault detection and tolerance, but a detection system inserts another component into the
system that must itself be analyzed for reliability so that it does not let a system fault go
undetected. Obviously, redundancy in detection components should reduce the risk that a
valid fault is missed or that a false detection of a non-existent fault induces unnecessary
action.

Software fault detection is complicated by the fact that faults in software
are always systemic. As mentioned above, fault tolerance in software is achieved by
diversity, or what is called N-version programming. Several versions of the software run
concurrently on one or more processors, and their outputs are compared for similarity.
With only two versions, if a difference is detected, it is difficult to tell which version is
correct, so the system must perform further diagnostics to determine the correct version.
The other software detection scheme, called the recovery block (Storey, 1996), is to run
diagnostics continually on all program versions to check for issues such as run-time
errors, math errors, or reasonability. However, to make a software system fault tolerant,
the system needs to recognize and fix a system state condition. To detect a fault, a fault
must occur. In software, the faulty execution changes the system state before the
redundant program assumes control. To fix this situation, the state of the system before a
failure needs to be known and saved somewhere so the system can recover. Much
research is being conducted on ways to save and reset system states without having to
make a copy of system states continuously. One example is the new multicasting
technique over a layered network. Non-time-dependent Markov models of the system
state can be used to determine the risk due to a software fault (intentional intrusion), and
can be used to inform the architecture of the system to control the component and
subsystem state changes and recovery to the operating state. Software reliability
predictions can be difficult to assess since some parts of the software are used
infrequently, and a fault in an intermittently used portion of the program may not show
itself until well into system operation. For this reason, testing of all possible program
execution cases is important, and if too many cases exist to make that economically
feasible, at least those executable programs that impinge on critical operations must be
tested and verified.
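
The need to save the system state before a faulty execution changes it can be sketched
as follows. This is a simplified illustration in the spirit of a recovery block, not code
from Storey (1996); the function names, the dictionary used as system state, and the
acceptance test are all assumptions. Each version runs against the live state, and if it
raises an error or fails the acceptance test the state is rolled back to the checkpoint
before the next, diverse version is tried.

```python
import copy


def recovery_block(state, versions, acceptance_test):
    """Try each version in turn; roll back to the checkpoint on failure."""
    checkpoint = copy.deepcopy(state)        # saved pre-execution state
    for version in versions:
        try:
            version(state)                   # may mutate state before failing
            if acceptance_test(state):
                return state
        except Exception:
            pass                             # run-time error: treat as a fault
        state.clear()
        state.update(copy.deepcopy(checkpoint))   # restore before the retry
    raise RuntimeError("all versions failed the acceptance test")


def primary(state):
    state["total"] = -1                      # faulty: corrupts the state
    raise ZeroDivisionError("simulated run-time error")


def alternate(state):
    state["total"] = sum(state["readings"])  # diverse, simpler implementation


result = recovery_block(
    {"readings": [1, 2, 3], "total": 0},
    [primary, alternate],
    acceptance_test=lambda s: s["total"] >= 0,
)
```

The sketch copies the entire state at every call, which is exactly the continuous copying
cost that the research mentioned above tries to avoid.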

D. SURVIVABILITY MODELING

1.  Reactive  Risk  Analysis 

Reactive  risk  analysis  (Hamdi  and  Bordiga,  2005,  pp.  783-785)  examines  the 
resilience  of  a  network  once  an  attack  is  underway  by  analyzing  the  network’s  reaction  to 
the  attack  given  a  set  of  attack  detection  devices  and  countermeasure  mechanisms  to 
resist  attack  and  recover  the  network.  The  reaction  to  the  attack  depends  on  the  following. 

•  The  type  of  attack  (from  attack  tree  analysis) 

•  The  number  and  type  of  intrusion  detection  devices  and  their  efficiency  at 
detecting  true  intrusions  and  ignoring  false  alarms 

•  The  number  and  type  of  countermeasures  installed  and  their  capability  at 
resistance  and  system  recovery 

A  consideration  in  deciding  how  resilient  to  make  the  network  is  the  cost  of 
detecting  and  reacting  to  an  attack  weighed  against  the  benefit  of  detection  and  reaction. 
The  benefit  is  highly  dependent  on  the  capability  retained  and  the  functionality  or  data 
saved  or  recovered,  and  the  effect  of  the  loss  before  recovery  on  the  ultimate  objective  of 
the  network  at  the  time  of  attack. 

An  intrusion  detection  system  (IDS)  is  most  often  composed  of  a  sensor  and  an 
analyzer.  The  sensor  uses  either  pattern  or  behavior  recognition  to  tell  that  an 
unauthorized  node  has  penetrated  the  network  and  alerts  the  analyzer  to  attempt  to 
capture  the  intruder’s  parameters  (packet  header  fields  and  other  metric  data).  Upon
inspection,  the  analyzer  makes  the  determination  from  the  captured  data  whether  or  not
this  is  an  intruder  so  that  the  other  countermeasures  can  take  appropriate  action  to  resist 
the  penetration,  and  recognize  and  recover  any  lost  data  or  functionality. 

The  authors  developed  a  cost/benefit  model  that  could  be  used  to  conduct  an 
analysis  of  different  types  and  levels  of  IDS  and  countermeasures  to  support  the  decision 
of  how  resilient  a  network  should  be  given  the  level  of  funding  and  other  resources 
available.  The  model  formulation  is  expanded  in  Appendix  C.  One  unique  factor  claimed 
by  these  authors  is  the  use  of  an  attack  progression  factor  in  the  analysis.  Depending  on 
the  type  of  attack,  the  impact  of  an  intrusion  can  be  constant  or  can  grow  linearly  or 
exponentially  over  time  as  the  intrusion  progresses.  How  the  attack  is  countered  and  the 
resource  costs  to  deal  with  the  intrusion  depend  on  the  characterization  of  the  initial 
impact  as  well  as  what  effect  the  attack  has  over  the  time  the  attack  is  active  until  it  is 
stopped  and  system  recovery  has  been  initiated. 
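
The effect of the progression factor can be illustrated with a small sketch. The
functional forms, the parameter names `initial` and `rate`, and the numbers below are
assumptions chosen for illustration; they are not the formulation given in Appendix C.
The sketch compares how the impact grows with the time taken to stop the intrusion
under constant, linear, and exponential progression.

```python
import math


def impact(initial: float, rate: float, t: float, kind: str) -> float:
    """Impact of an intrusion that has been active for t time units."""
    if kind == "constant":
        return initial
    if kind == "linear":
        return initial * (1 + rate * t)
    if kind == "exponential":
        return initial * math.exp(rate * t)
    raise ValueError(f"unknown progression kind: {kind}")


# Impact if the attack is stopped at t = 0, 5, or 10 for each progression type.
table = {
    kind: [round(impact(100.0, 0.2, t, kind), 1) for t in (0, 5, 10)]
    for kind in ("constant", "linear", "exponential")
}
```

For an exponentially progressing attack, the value of fast detection dominates the other
cost terms, whereas for a constant-impact attack a slower and cheaper reaction may be
acceptable.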

The  four  elements  of  the  reactive  model  are  as  follows. 

•  Cost  of  detecting  the  attack 

•  Cost  of  reaction 

•  Impact  of  the  attack  (when  the  progression  factor  applies) 

•  IDS  efficiency 

This model is useful for comparing the costs of IDS and countermeasures, provided
that plausible values can be found for the model’s variables. While the progression factor
is an interesting concept that extends the model, determining an expression or function
that captures all the dynamics of the progression of an attack is probably not realistic.
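
A comparative use of the four elements might look like the following sketch. It is a
deliberately simplified stand-in for the authors’ model: the `Option` fields, the
expected-cost expression, and every number are hypothetical, and the cost of false
alarms is omitted. Each option is charged its detection cost, plus the reaction cost and
residual impact when the attack is detected, plus the full impact when it is missed.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    detect_cost: float    # cost of operating the IDS over the period
    react_cost: float     # cost of the countermeasure response
    efficiency: float     # probability that a real attack is detected
    residual: float       # fraction of impact still suffered when detected


def expected_cost(opt: Option, impact: float) -> float:
    """Expected total cost of one attack under a given IDS option."""
    detected = opt.efficiency * (opt.react_cost + opt.residual * impact)
    missed = (1 - opt.efficiency) * impact
    return opt.detect_cost + detected + missed


options = [
    Option("no IDS", 0.0, 0.0, 0.0, 1.0),
    Option("basic IDS", 10.0, 5.0, 0.6, 0.3),
    Option("advanced IDS", 40.0, 5.0, 0.9, 0.1),
]
impact = 100.0
costs = {o.name: expected_cost(o, impact) for o in options}
best = min(costs, key=costs.get)
```

The ranking is only as good as the inputs, which is the limitation noted above: the
comparison is informative only if plausible values can be assigned to each variable.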

2. Modeling the Recovery Phase of a Survivable Network

Heegaard and Trivedi (2009) develop a model to quantify the survivability of a
telecommunications network by examining the state and capacity of the virtual
connections between peering nodes, with the aim of maximizing throughput and
minimizing delays when the network is under failure from intentional intrusion, natural
disasters, or failed states. Their approach to the steady state model is similar to a model
by Chen, Garg, and Trivedi that looked at the rate of frame drop in steady state and
transient losses due to faults in a wireless ad hoc network (Chen, Garg, & Trivedi, 2002).
They combine a continuous time Markov chain (CTMC) model with traffic queuing
models for the steady state network availability. Then, they use several different models
for propagation of failures due to undesirable events and quantify the network’s recovery
cycles.
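
The idea of combining a CTMC with a queuing model can be shown with the smallest
possible example. This is not the Heegaard and Trivedi model; the two-state up/degraded
chain, the M/M/1 queue, the function names, and all rates are assumptions made for
illustration. The sketch also assumes that the queue settles to its steady state within
each link state, so the two delays can simply be weighted by the state probabilities.

```python
def availability(failure_rate: float, repair_rate: float) -> float:
    """Steady-state probability of the 'up' state in a two-state CTMC."""
    return repair_rate / (failure_rate + repair_rate)


def mm1_mean_delay(arrival_rate: float, service_rate: float) -> float:
    """Mean time in system for an M/M/1 queue (requires arrival < service)."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate >= service rate")
    return 1.0 / (service_rate - arrival_rate)


a = availability(failure_rate=0.01, repair_rate=0.99)       # assumed rates
normal = mm1_mean_delay(arrival_rate=80.0, service_rate=100.0)
degraded = mm1_mean_delay(arrival_rate=80.0, service_rate=90.0)
mean_delay = a * normal + (1 - a) * degraded
```

Even in this toy case, the delay seen by traffic depends on both the failure and repair
rates and the capacity that remains after a failure, which is the coupling the combined
models are built to capture.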

The authors contend that making a network more survivable is accomplished by
three actions: i) preventative measures (stop the attack before it starts), ii) designing in
enough spare capacity and sufficient diversity to make the network fault tolerant, and iii)
developing and configuring proactive and reactive traffic management techniques and
protocols (equally applicable to data networks and wireless networks, as traffic
management is a vital attribute of any efficient network). These models are most useful
to the risk and survivability analyst because they identify the network attributes and
software functions that have a major impact on survivability, and therefore where to
place emphasis on designing in a more robust survivability capability, such as improved
data replication or diversity of components.

The Heegaard and Trivedi model quantifies the virtual management of network
traffic for survivability and discusses how it is accomplished with changes to traffic
routing requirements, traffic loads, and capacity changes due to random, non-synchronous
service requests. The undesired events cause failures of nodes and their communication
links (arcs between nodes), which reduce the network resources of bandwidth, memory,
and processing speed and capacity. Recovery is accomplished through rerouting and
restoration of the failed nodes and links. That design engineers are considering these
models is evidence that efforts to design traffic management techniques that improve
survivability in networks are plentiful. Work continues to improve the techniques for
traffic and process flows through different layers of a network, such as the work at
Cornell University on a “Virtual Synchrony” process. This technology uses the
multiple-process handling of the Common Object Request Broker Architecture (CORBA)
client-server middleware in a fault tolerant architecture on top of multicasting techniques
of network traffic management (Birman, 2009).

3. Characteristics of Dynamic Mobile Networks

In their work on describing and simulating dynamic mobility networks, Scherrer,
Borgnat, Fleury, Guillaume, and Robardet (2008) look at how networks change over time
as new nodes attach and failed nodes are removed. Through analytical models and
simulation of a mobile communications network connected by Bluetooth®, they
concluded that link creation and deletion are independent of each other and can be
modeled by a Markov process. The interaction within the network, however, is random
and characterized by the creation of communities that intensify the data flow rates within
the communities in statistically insignificant and uncorrelated ways, which complicates
the modeling of network traffic flow within a large-scale system. Since the activity of
communities changes randomly over time, a true measure of information loss
probabilities is difficult to simulate, whereas link creation and deletion can be formulated
and simulated. That communities have unpredictable data flow rates and link creations
between them seems intuitive: human activity that requires a network to control an event
or meet a schedule most often depends on external events that are independent of the
network’s activity level or its steady state processing of information.
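
The part of this behavior that can be formulated, link creation and deletion, is simple to
simulate. The sketch below is an illustration only: the discrete-time two-state chain, the
per-step probabilities `p_create` and `p_delete`, and the fixed random seed are
assumptions and are not taken from the cited study. It compares the simulated fraction
of time a link exists with the stationary probability p_create / (p_create + p_delete).

```python
import random


def simulate_link(p_create: float, p_delete: float, steps: int, seed: int = 0):
    """Simulate a two-state (down/up) discrete-time Markov chain for one link."""
    rng = random.Random(seed)
    up, up_steps = False, 0
    for _ in range(steps):
        if up:
            up = rng.random() >= p_delete    # link survives this step
        else:
            up = rng.random() < p_create     # link is created this step
        up_steps += up
    return up_steps / steps


p_create, p_delete = 0.1, 0.3
expected_up = p_create / (p_create + p_delete)        # stationary probability
observed_up = simulate_link(p_create, p_delete, steps=200_000)
```

No comparably compact model exists for the community-driven traffic described above,
which is why information loss is so much harder to estimate than link availability.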

E. SUMMARY

A network designed for survivability and resiliency is one that can continue to
provide critical services while under attack and has enough resistance and adaptable
recovery to return to its normal state. Survivable networks are architected to recognize an
attack, resist the attack to the greatest extent possible, and recover quickly after the attack
has been stopped or thwarted. Networks designed for resiliency are capable of immediate
adaptation, shifting the workload in the network from the failed nodes to the operable
nodes with little or no effect on the end-to-end capability of the network. An example of
resiliency in a network is the adaptation of Unmanned Aerial Systems (UAS) to pass
surveillance missions between sensor platforms when they are networked with a resilient
middleware communications package, allowing uninterrupted intelligence to flow to the
field unit depending on it (deJong, 2009). A survivable and resilient network provides
defense in depth for the operation of critical network systems.

Survivability is scenario-driven, and defining survivability requirements with
which to build a survivable network system is challenging. For this reason, the architect
must look at the network’s boundaries, the interface to other networks, and define where
the line is drawn to resist attacks from threats that come in various ways with an array of
capabilities. Once an attacker has penetrated a network, the architect must look at the
capability of the system to adapt and recover while stopping the attacker’s progression.
Adaptability has to be built into a system on initial design. Unlike manned systems that
can adapt with human intervention, network systems require adaptation in fractions of
seconds, through complex software logic and must be able to do this automatically.

Quantitative models of network operational characteristics and attributes are
informative for measuring a network’s capability to adjust one parameter under failure
from a single attack scenario. When designing a system for survivability, these
measurements may be useful in the evaluation of tradeoffs between different system
architectures as long as the parameter under study is reflective of a quality required to be
maintained in the network during attack to ensure mission objectives continue to be met.

Some conclusions that can be made with respect to survivability modeling follow.

• The ability to quantify the survivability of a network is an important step
in determining the reliability and availability of a network under attack. As
the network grows in size and complexity, quantifying survivability
becomes quite complicated as the conditions and constraints on the model
become more detailed and interdependent.

• The models above are not process models per se but are system models
that formulate the problem of survivability mathematically or
systematically and can be used to solve for a quantitative number
representing survivability by optimization techniques, probability
calculations, or simulation.

• The first step in quantifying survivability of a network is to determine the
characteristics of the network under normal steady state operation. The
classic model for steady state network node characteristics is a continuous
time Markov chain, with information arriving at a node according to a
Poisson process and service times at the node following an exponential
distribution. There are many situations where the assumption of these
distributions is not accurate enough to provide a reliable representation of
the network’s operation, and care must be taken to understand the
information flow characteristics within the network, how they are
changing over time, and the most accurate depiction of the network’s
handling of information from sender to receiver.

•  Survivability  does  not  consider  the  probability  that  an  attack  is  made  on 
the  network;  the  attack  has  already  commenced,  and  the  analysis  of 
survivability  deals  with  how  the  network  handles  the  attack  once  it  has 
begun. 

• Survivability analysis must balance the costs of designing the system for
survivability with the cost in functionality reduction to implement
survivable features. Some survivability features, such as an intrusion
detection system (IDS) or device, have direct costs of technology
development and installation, monitoring, and maintenance, but do not
have an opportunity cost in lost functionality because the countermeasure
does not inhibit normal operation. One opportunity cost that might arise
from an IDS is its reliability in minimizing false alarms. Other
survivability features may involve an opportunity cost, as in traffic
management schemes that lower throughput below what is achievable
without the feature. A direct cost may be incurred if throughput is
important and traffic management restrictions are mitigated by larger
bandwidth. Paramount should be the cost of losing capability in an attack
scenario if survivability is not designed into the network, and the risk of
losing the network’s capability for an unacceptable period of time because
recovery and rerouting were not designed into the network in the first
place.

• Quantification of survivability requires the measurement of some network
value of interest (packet loss rate, delay in packet arrival). The formulation
of the model must have a basis in truth or in experimental data to back up
the assumptions as to how these key parameters are affected by a network
failure from attack.

•  Network  configuration  and  network  operating  characteristics  change 
depending  on  the  applications  being  used,  the  number  and  connectivity  of 
network  members  accessing  services  or  communicating  with  other 
members.  While  node  behavior  can  be  predicted  on  knowledge  of  the 
node’s  make-up  and  what  the  node  is  designed  to  process  or  pass,  the 
behavior  of  the  members  can  be  random  and  unpredictable.  Survivability 
must  take  into  account  the  user  population  and  the  typical  behavior 
patterns  of  the  typical  user  and  how  that  behavior  might  change  based  on 
the  external  environment  to  which  the  user  is  exposed. 
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Obviously,  models  that  assist  in  quantifying  a  network’s  survivability  support 
management  of  risk  to  a  network.  However,  what  models  like  these  do  not  provide  is  the 
answer  to  the  question  of  what  value  of  survivability  is  required  to  meet  the  mission. 
Proeeeding  to  the  next  step,  a  detailed  mission  deseription  must  be  artieulated  that 
speeifies  the  parameters  a  network  must  meet  during  network  failure  reeognition  and 
recovery. Without clear-cut mission effectiveness requirements and specifications, the
capability of the network to survive a given attack is meaningless. For instance, in the
Chen & Trivedi wireless network model, survivability is the capability of the network not
to lose too much information from the sender to the receiver and to have them
communicating once again as quickly as possible. However, it has not been determined
what threshold values of packet loss or packet delay are required to accomplish a
given task, or what the task profile for the network is. Survivability analysis must include
a definitive description of the system’s goals. For instance, in a commercial wireless
telecommunications network, it would be helpful to know the recovery time
needed to keep from losing too many customers due to dropped calls, and the probability
that the network can meet that minimum service quality under attack from
the most likely threat or the most likely failure. Also not specified in the model is what
information is lost, or whether a priority in information content dictates what can be
dropped and what must make it through the network. If the network drops a connection
for only a minute, the information lost may or may not be important, but it is not possible
to tell from these models or to quantify that attribute. While the models represent the
system well under a given attack, most network models use the dropped link or failed
node as the basis for model development. Other problems could be inflicted on the
network, such as the rerouting of sensitive information or the distortion of information,
without the network sensing a connectivity problem that would prompt a node switch. It is
unclear how these models would handle this situation. As the number of threat types
increases, the solution to the optimization problem becomes quite complex.
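
One way to make such a requirement concrete is to state it as explicit thresholds and test an observed outage against them. The following is a minimal sketch only: the class names, the two thresholds (packet loss fraction and recovery time), and the numbers in the usage note are illustrative assumptions and are not taken from the Chen & Trivedi model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurvivabilityRequirement:
    """Hypothetical mission requirement; the fields are illustrative assumptions."""
    max_packet_loss: float       # largest tolerable fraction of packets lost (0..1)
    max_recovery_seconds: float  # longest tolerable time to restore service


@dataclass(frozen=True)
class OutageObservation:
    """Measurements taken during one attack or failure scenario."""
    packets_sent: int
    packets_delivered: int
    recovery_seconds: float


def packet_loss(obs: OutageObservation) -> float:
    """Fraction of packets lost during the observed outage."""
    if obs.packets_sent == 0:
        return 0.0
    return 1.0 - obs.packets_delivered / obs.packets_sent


def meets_requirement(req: SurvivabilityRequirement, obs: OutageObservation) -> bool:
    """True only if both the loss and the recovery-time thresholds are satisfied."""
    return (packet_loss(obs) <= req.max_packet_loss
            and obs.recovery_seconds <= req.max_recovery_seconds)
```

For example, with an assumed requirement of at most 5 percent loss and 30 seconds to recover, an outage that delivers 980 of 1000 packets and recovers in 12 seconds passes, while one that delivers only 900 fails. Without such stated thresholds, the same measurements cannot be judged either way.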

VI. CONCLUSIONS


A.  SUMMARY 

The Department of Defense and the military services are well on their way to
establishing information dominance, enabled by network connectivity, as the centerpiece
and dominant factor in the defense and warfare strategy of the present and future. The
edge over the enemy is the ability to control events through knowledge of the situation,
eliminating as much as possible the “fog of war.” It is essential that this information be
shared with the war fighter in as clear and unambiguous a way as possible, ensuring that
the information shared gets to the war fighter, is the correct information, and is
unavailable to the adversary. It is equally important that the opposition be hampered as
much as possible in its ability to reach this goal with its war fighters. The desired effect
is to increase the density of the enemy’s “fog of war.”

The  DoD  CIO  says:  The  Department  of  Defense  is  transforming  to 
become  a  net-centric  force.  This  transformation  is  based  upon  the 
recognition  that  information  is  a  critical  strategic  component  that  enables 
decision  makers  at  all  levels  to  make  better  decisions  faster  and  to  act 
sooner.  Ensuring  timely  and  trusted  information  is  available  where  it  is 
needed,  when  it  is  needed,  and  to  those  who  need  it  is  at  the  heart  of  net- 
centricity.  (DoD  GIG,  2007,  Preface) 

The U.S. military’s information systems and the network enterprise are threatened
on several fronts: equipment malfunction, unintentional mistakes and accidents causing
system  and  network  inoperability,  hazards  from  natural  causes  and  the  exposure  of 
systems  to  hostile  environments,  and  intentional  disruption  and  destruction  of  data  and 
functionality  by  many  adversaries,  each  with  their  own  motivation  for  opposing  the 
military’s  objectives  and  disrupting  the  flow  of  information. 

The  GAO  warns:  In  addition,  DOD  faces  risks  inherent  with  the  nature 
and  scope  of  the  effort  it  is  undertaking,  for  example,  risks  related  to 
protecting  data  within  the  thousands  of  systems  that  will  be  integrated  into 
the  network.  Furthermore,  the  technical  challenges  to  develop  new 
networking  and  network  management  capabilities  to  support  mobile, 
integrated  communications  are  considerable.  (U.S.  Government 
Accountability  Office,  2004,  p.  4) 

Much effort since the start of the information age has gone into research,
development, and testing of technologies and practices that increase the security of
computers and networks. The need for an ability to recognize the threats to information
systems and networks and to determine what to do about the threats is paramount. This is
best accomplished through a robust program of network and information system risk
analysis and management. The importance of taking a holistic view of network risk
assessment rather than focusing on short-term fixes is clearly articulated by Chittister &
Haimes in their article on cyber security and the software lifecycle.

The balance for achieving secure information systems is tilted more
toward short term tactical measures, focusing on fire walls, patching, and
response to cyber attacks, and less toward long-term, strategic approaches
that address the entire software lifecycle development. (Chittister &
Haimes, 2006, p. 2)

1.  Network  Architecture 

This thesis examined a network’s architecture from the hardware aspect
(topology) and the software aspect (layers and SOA) and how certain architectures create or
mitigate vulnerabilities that could be exploited by threats.

Conclusion 1: A study of network topology from the standpoint of vulnerabilities
should be part of the evaluation of alternatives, along with performance
and cost. Networks architected from concept definition to resist attack are more capable
of evolving with the growth of the network into a more secure posture.

Conclusion 2: The decision to use open architecture and to take advantage of the
attributes of a SOA carries with it several implications that can introduce vulnerabilities.
One of the principal sources of vulnerabilities introduced through software architecture
is the reliance of open architecture and SOA on COTS. COTS software may not have been
adequately tested before use in the military’s network systems, and it may contain
hidden software programs or logic that introduce
unknown vulnerabilities that can appear during network operation.

2. Network Risk Management

The thesis researched the elements of risk, especially as they apply to networks;
the composition of a risk management process, with a detailed look at DoD’s risk
management for acquisition programs; and the attributes of commercial risk
management processes that might support the continuing improvement of a network
enterprise risk management approach for DoD Net-centric operations and the GIG
architectural framework.

Conclusion 3: Identified risks need to be fully assessed not only for their potential
to disrupt the network, but also for the consequences to the operations the network is
designed to support. A rigorous and enlightened assessment of network risk should support
network designers and operators in deciding whether it is worth the cost in funding
or opportunity to plan and implement a mitigation strategy for a given risk. The answer to
the question “What is this mitigation strategy protecting?” has a direct effect on the
mitigation strategy employed.
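
The cost/benefit decision described in Conclusion 3 can be sketched numerically. The example below assumes the common formulation of risk exposure as likelihood multiplied by consequence cost; the function names and the figures in the usage note are hypothetical and serve only to illustrate the comparison.

```python
def risk_exposure(likelihood: float, consequence_cost: float) -> float:
    """Expected loss for one risk, assuming exposure = likelihood x consequence."""
    if not 0.0 <= likelihood <= 1.0:
        raise ValueError("likelihood must be a probability between 0 and 1")
    return likelihood * consequence_cost


def mitigation_net_benefit(likelihood_before: float,
                           likelihood_after: float,
                           consequence_cost: float,
                           mitigation_cost: float) -> float:
    """Reduction in expected loss minus what the mitigation costs.

    consequence_cost should reflect the effect on the supported operation,
    not only the cost of the network disruption itself.
    """
    reduction = (risk_exposure(likelihood_before, consequence_cost)
                 - risk_exposure(likelihood_after, consequence_cost))
    return reduction - mitigation_cost
```

With an assumed consequence cost of 1000 units, a mitigation that lowers the likelihood from 0.5 to 0.25 is worth 150 units net if it costs 100, but loses 150 units if it costs 400. The outcome depends directly on how the consequence to the mission is valued.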

Conclusion  4:  Attack  trees  are  a  useful  tool  in  analyzing  network  vulnerabilities 
and assessing the motivating factors of threats to attack a network. When analyzed from
the standpoint of game theory, attack trees (based on the reliability analysis done through
fault trees) contribute insight into network vulnerabilities as viewed by the network
versus as viewed by the threat.
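
A minimal sketch of attack tree evaluation follows. It assumes the conventional AND/OR semantics: an AND node requires every child step, so attacker costs add, while an OR node is satisfied by any one child, so the attacker picks the cheapest. The node names and costs in `example_tree` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AttackNode:
    name: str
    kind: str = "LEAF"   # "LEAF", "AND", or "OR"
    cost: float = 0.0    # attacker cost for a leaf, in arbitrary units
    children: List["AttackNode"] = field(default_factory=list)


def cheapest_attack(node: AttackNode) -> float:
    """Lowest total attacker cost needed to achieve this node's goal."""
    if node.kind == "LEAF":
        return node.cost
    if not node.children:
        raise ValueError(f"{node.kind} node '{node.name}' has no children")
    child_costs = [cheapest_attack(child) for child in node.children]
    if node.kind == "AND":
        return sum(child_costs)   # every step must be carried out
    if node.kind == "OR":
        return min(child_costs)   # attacker chooses the cheapest path
    raise ValueError(f"unknown node kind: {node.kind}")


def example_tree() -> AttackNode:
    """Hypothetical tree: disrupt routing by stolen credentials OR a router exploit."""
    exploit = AttackNode("exploit router", "AND", children=[
        AttackNode("find unpatched router", cost=10.0),
        AttackNode("bypass access control list", cost=20.0),
    ])
    steal = AttackNode("steal administrator credentials", cost=40.0)
    return AttackNode("disrupt routing", "OR", children=[steal, exploit])
```

In this hypothetical tree the router exploit (cost 30) is cheaper than credential theft (cost 40), so a defender reasoning from the threat's point of view would prioritize patching and access control on routers.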

3. Network Survivability and Resilience

This  thesis  examined  resilience  and  survivability  of  networks  and  strategies  for 
architecting  networks  for  those  qualities.  Survivability  means  a  reduction  in  susceptibility 
and  vulnerability,  and  the  ability  to  recover.  Resilience  in  a  network  is  an  abstract  quality 
that  metaphorically  means  the  ability  to  resume  the  previous  shape.  Architecting  a 
network  for  resilience  is  challenging  in  that  the  requirements  for  survivability  are 
scenario  driven.  Articulating  and  defining  a  survivability  or  resilience  requirement  can  be 
problematic. 

Conclusion 5: To architect resilience into a network, a network architect must
look  at  the  network’s  boundaries,  the  interface  to  other  networks,  and  define  where  the 
line  is  to  be  drawn  to  resist  attacks  from  threats  that  come  in  various  ways  with  an  array 
of  capabilities.  Once  an  attacker  has  penetrated  a  network,  the  architect  must  look  at  the 
capability  of  the  system  to  adapt  and  recover  while  stopping  the  attacker’s  progression. 
Adaptability  has  to  be  built  into  a  system  on  initial  design. 

4. Network Enterprise Risk Management

This  thesis  compared  some  of  the  popular  risk  management  processes  in  the 
public  domain.  It  examined  how  their  methodology  might  support  DoD’s  network 
enterprise  risk  management  process  to  achieve  a  survivable  and  resilient  enterprise 
network  and  support  DoD  in  making  risk  assessments  and  decisions  on  the  cost/benefit  or 
value  of  the  choices  in  alternative  architecture  and  countermeasure  use  for  risk  mitigation 
implementation. 

Conclusion 6: This thesis contrasted some of the popular risk management
processes  in  the  public  domain.  A  common  thread  in  all  the  commercial  process  models 
is  the  necessary  involvement  of  the  entire  organization  in  the  process,  from  top  level 
management  to  the  network  administrators  and  functional  managers.  Leaders  and  the 
people  in  command  positions  need  to  take  an  on-going  role  in  the  security  and  risk 
management  of  their  most  valuable  assets,  the  networks  and  the  information  and 
functionality  contained  therein.  Risk  management  is  a  continuous  process. 

Conclusion 7: In the final analysis, managing risk is a balancing act. However, it
is a process that is necessary; it is a process that needs to evolve; and it is a process that
needs to be continuous. Risk could be eliminated by erecting barriers impenetrable to
potential threats, or by just shutting everything down. The consequence of this action is
that network users would be unable to use the network to achieve their objectives. The
converse is to ignore the level of risk, or to set the criteria for an acceptable level of risk
so loosely that mitigation is avoided, allowing uninhibited access to and mobility of data
and functionality contained in the network. This method is just as untenable as shutting
down the network completely every time the network is attacked. Without mitigation, the
network would be inundated with illegitimate members intent on causing harm to satisfy
many motives (data destruction, fabrication, and interruption of critical services). The
process of managing risk gets more complicated as systems grow in size and complexity,
and as the systems become distributed both functionally and geographically. The
answer is to conduct a continuous assessment of the risks to a network, to balance
network capability prudently with network security, and to apply the available resources
wisely in doing so.

B. RECOMMENDATIONS OF AREAS FOR FURTHER RESEARCH

It  is  the  contention  of  this  thesis  that  to  continue  to  provide  the  network  capability 
desired  in  the  GIG  enterprise  to  support  the  strategy  of  information  dominance  even 
when  the  network  is  under  attack,  the  network  must  be  designed  with  the  quality 
attributes  of  survivability  and  resilience.  Service  Oriented  Architecture,  with  its  loose 
coupling  at  the  boundaries  and  its  objective  of  software  reuse  by  making  the  service 
independent  of  the  client  through  the  use  of  description  language  and  callable  interfaces, 
offers  an  opportunity  to  examine  this  process  for  ways  to  make  the  underlying  network 
resilient  and  adaptable  to  realigning  the  service  when  the  network  is  under  attack.  Further 
research  and  analysis  should  concentrate  on  how  this  would  be  done. 

Service  Oriented  Architecture  offers  significant  advantages  in  an  enterprise  as 
large as the GIG is envisioned to become. However, one of the biggest sources of
vulnerability that increases the level of risk to the enterprise is the use of COTS software
in  designing  the  network  with  open  architecture.  Considering  the  cost  savings  and  time  to 
provide  capability  to  the  war  fighter  through  the  use  of  COTS,  research  and  analysis  on 
how  to  close  the  vulnerability  gaps  created  by  using  COTS  should  be  conducted.  The  use 
of  quality  control  measures  such  as  Capability  Maturity  Model  Integration  and  others  is 
vital  to  closing  that  gap.  Determining  sufficient  software  testing  requirements  that  do  not 
take  an  inordinate  amount  of  time  or  resources  but  still  provide  the  assurance  of  quality 
would  be  a  step  ahead  in  the  military’s  ability  to  enhance  the  one  factor  that  is  under  their 
control  in  the  risk  equation;  i.e.,  reducing  network  vulnerability  in  a  network  that 
provides  invaluable  resources  to  the  war  fighters  in  their  efforts  to  reach  their  objectives. 

APPENDIX A. VULNERABILITY AND MITIGATION STRATEGIES BY NETWORK LAYER


A. INTRODUCTION

As discussed in Chapter II, a network is a complex organization of physical
components and connecting devices arranged in a certain topology, signal paths, software
logic controlling the signals, protocols which define how information is packaged, logic
programs controlling the routing of the packets of information, packet addressing
schemes to get the information from source to destination, software programs for
determining who receives what information, mechanisms to keep packets from
interfering with one another, the data and functionality contained in packets sent through
the network, and a variety of other schemes for making the network operate correctly and
perform the functions desired. The International Organization for Standardization (ISO) has
decomposed the operation of networks into seven layers in its Open Systems
Interconnection (OSI) model, and by and large, a plan to evaluate and mitigate risks to
network enterprises must consider the layer abstractions to properly define the mitigating
strategy that allows the security software logic and mechanisms to operate
correctly. The boundaries between some of the layers can overlap, and indeed network
abstractions by other organizations may combine ISO layers in their definitions. For
instance, Figure 17 shows how the Navy has combined layers of the ISO model into three
layers in its model of FORCENet: the communications and network layer, the enterprise
services layer, and the applications services layer. In this discussion, the layers of the
ISO Open Systems Interconnection (OSI) model are compared
for vulnerabilities and for the Quality of Service attributes that need protection.

This figure is a comparison of the network information abstraction as depicted in the
ISO seven layers with the layers defined by the U.S. Navy’s FORCENet model of
information extraction. A mapping of the vulnerabilities and mitigating strategies can be
made into the FORCENet model from the description of the specific layer in the ISO
model. While the mapping is not necessarily direct, as these abstraction models overlap
in their actual functionality in an operable network, a comparison can be made and
translated between the two models closely enough to have some utility.


The risks inherent in the ISO layers translate into the FORCENet model
to inform the Navy’s network development team of the types of risk to look
for in each of the three layers of the FORCENet model.

Figure 17. ISO Network Layers Mapped to FORCENet Network Model (From:
Stewart, 2006)

B.  NETWORK  ISO  LAYERS  AND  THE  RISKS  TO  NETWORK  QOS 
ATTRIBUTES 

1. Physical Layer

Description: This layer is defined by the hardware of the network: the devices that
constitute a computer workstation or central processing unit of a controller, memory units
(hard drive, compact disc drive), network interface cards (Ethernet, modem, SONET
token), electronic switching devices (hub, switch, router), connection hardware (FireWire,
Universal Serial Bus, Bluetooth), connection media (optical fiber, twisted pair
cable, wireless radio signals), and power supplies. The hardware devices of the network
are constructed to prevent collision of data, to provide multiplexing, and to form the
electronic signals in the shape required for the communications medium (packets for
wired systems and frames for wireless signals) (Smith, 2003).

Vulnerability:  Primarily,  the  threat  is  to  the  network’s  availability  by  physical 
damage  to  components  as  well  as  the  reliability  and  availability  of  hardware  components. 
The  threat  at  the  physical  layer  is  probably  as  likely  to  come  from  an  unintended  accident 
(component  assembly  or  proximity  to  a  hazard)  as  from  a  malicious  attack.  Interruption 
to  power  sources  can  cause  damage  to  sensitive  electro-magnetic  components  through 
electro-magnetic  interference  or  power  surges. 

Mitigation  Strategies:  As  in  other  hardware  systems,  availability  can  be  enhanced 
by  fault  tolerant  strategies  of  redundancy  and  diversification.  Physical  separation  of 
components  at  this  layer  is  advantageous  as  well  as  power  isolation  techniques  such  as 
surge  suppression.  The  selection  of  hardware  components  and  their  location  in  the  path  of 
the network also affects the ability of the hardware to resist attacks. Speed of
transmission and bandwidth of the communications medium affect the ability to get
volumes of information to the right place within a time specification. The physical layer
is the first step in the architecture choices of a system, since the components chosen to do
certain tasks, the quality of the hardware, the arrangement of where components are
connected, the physical connectors used, the shielding of cabling employed, and many
more architectural choices drive the cost of the network. These choices also determine the
nature of the subsequent layers and what software can reside on the system components to
perform the services architected in the upper network layers.
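
The effect of redundancy on availability can be illustrated with the standard series and parallel reliability formulas. The sketch below assumes that components fail independently, which physical separation and diversification help to approximate; the availability figures in the usage note are hypothetical.

```python
from math import prod
from typing import Iterable


def series_availability(availabilities: Iterable[float]) -> float:
    """Availability of a chain in which every component must be up."""
    return prod(availabilities)


def parallel_availability(availabilities: Iterable[float]) -> float:
    """Availability of redundant components where any one being up suffices.

    Assumes independent failures: the group is down only if all are down.
    """
    return 1.0 - prod(1.0 - a for a in availabilities)
```

Under these assumptions, two redundant links that are each available 90 percent of the time give a combined availability of about 99 percent, whereas the same two links connected in series give only about 81 percent.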

2. Data Link Layer

Description: In this layer of a network, electronic signals are passed
indiscriminately node to node by frame delivery. These are the signals and packets that
are transmitted through the physical lines (or through the air in wireless networks) by
data link protocols such as the IEEE connectivity standards 802.2 and 802.11
(NSA/IAD, 2005). This is the layer in which the Media Access Control (MAC) address
is assigned to a particular piece of hardware, providing the internal addressing of a local
area network. Basic protocols (rules) for sending signals from a node or workstation, such
as Multiprotocol Label Switching (MPLS), Ethernet protocols, token ring, and
Point-to-Point Protocol, operate here. In this layer, the network conducts logical link
control (flow control) with flat addressing of information packets (Smith, 2003).

Vulnerability:  Since  this  layer  is  merely  the  flow  of  information  packets  before 
they  are  guided  by  source  and  destination  programming,  the  availability  of  data  and 
services is the QoS attribute most threatened at this layer. However, data signals that flow
through the Internet can be intercepted because access is available to anyone who can
connect to the Internet. Thus, packets (or wireless data streams of frames) could be
intercepted and, if not encrypted, could lead to loss of confidentiality.

Mitigation  Strategies:  Since  encryption  of  data  is  not  done  on  this  layer,  the 
technology  of  service-transparent  transport  or  “tunneling”  is  normally  employed  to 
protect  the  data  streams  using  Ethernet  technology.  It  can  be  implemented  in  this  layer  or 
in  the  physical  or  network  layer  to  shield  the  data  stream  from  recognition  upon 
interception. Figure 18 shows a typical tunneling architecture using Ethernet technology
and  control  and  management  functions  (May,  2004).  In  addition  to  tunneling,  data 
replication  at  this  layer  can  provide  some  mitigation  to  availability  of  services. 

Tunneling is a method to keep data safe while in transit outside the
security protections of the Local Area Network when it is between source
and destination gateways.

Figure 18. An Example of Tunneling at the Data Link Layer with Ethernet (From:
May, 2004)

3. Network Layer

Description: The network layer in a network is the abstraction that implements
end-to-end packet delivery and routing through a network with addressing via Internet
Protocol  (IP).  It  is  also  the  location  where  network  management  services  are  conducted 
such  as  how  the  packets  are  handled  at  each  intermediary  node  (usually  a  router).  Packets 
are  sent  from  the  source  without  establishing  a  connection  with  the  destination,  and 
routing  is  done  through  subnet  routers  for  message  forwarding  and  data  flow  control. 
However,  the  source  can  receive  acknowledgement,  and  some  error  checking  and  control 
of  the  packets  is  done  at  this  layer.  With  hierarchical  addressing  being  done  in  the 
network  layer,  this  abstraction  defines  the  path  which  the  packets  take  to  reach  their 
destination.  The  network  layer  can  make  service  requests  to  the  data  link  layer  (Smith, 
2003). 

Vulnerability: Since this layer is where network management takes place, the
configuration and performance of network traffic flow are monitored and managed from
this layer. This is also a layer where access control is maintained to network system
components, those components that manage network characteristics. At this layer,
intrusion into the network affects network performance characteristics such as
connectivity between critical nodes, how traffic is routed node to node, and the number of
hops from end to end. Intrusion also affects the latency of data travel and the software
control of bandwidth allocation to certain users, subnets, or LANs. Loss of data
availability due to connectivity or routing issues on the network layer is the major threat.
Integrity of data can also be affected, as intrusion into network management can cause
excessive information packet loss and distortion of data received by the end user.

Mitigation  Strategies:  Maintaining  minimum  configuration  standards  for  the 
network  and  components,  especially  routers,  on  the  network  is  important  for  maintaining 
security  to  ensure  availability  of  network  paths  for  legitimate  information  flow.  To  assist 
in  security  management,  it  is  in  this  layer  that  intrusion  detection  device  characteristics 
are  abstracted  to  measure  their  performance  in  detecting  unauthorized  access  to  the 
network. With basic packet addressing at this layer, naming conventions and protection of
network addressing are important, and the confidentiality of network component locations
is maintained by hiding Domain Name System (DNS) servers behind an encryption and
authentication device and preventing DNS reverse look-up that gives a domain name
away. When the packets with an IP address leave the LAN, the IP address can be
repackaged to prevent IP spoofing. Spoofing is a way for malicious actors to use
someone else’s IP address to gain unauthorized access to a location.

4.  Transport  Layer 

Description: The transport layer abstraction is where the information and control
packets have been fully addressed with the Transmission Control Protocol (TCP) or the
User Datagram Protocol (UDP) and prepared for transmission from one user to
another. TCP is a connection-oriented protocol. Port numbers are added to the IP address
to allow access to ports called network sockets. The TCP addition to the packet address
establishes the communication between sender and destination (or host and application).
Additionally, packets are assigned ordering numbers to allow sequential reception and
reordering at the destination, and to allow packets lost in congested
nodes to be resent. At the transport layer, the network seeks to provide congestion
avoidance. Between TCP and UDP, UDP has less protocol overhead, which allows higher
throughput and shorter latency, and it is often used for video and Voice over Internet
Protocol (VoIP), where data drop is acceptable but reduced latency is desired. The
Hypertext Transfer Protocol (HTTP) uses TCP for web browsing (Smith, 2003).
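
The role of ordering numbers can be shown with a small sketch of reassembly at the destination. This is a simplification and not an implementation of TCP: it numbers whole segments from zero, whereas real TCP sequence numbers count bytes, and the function name `reassemble` is an assumption made for illustration.

```python
from typing import Dict, List, Tuple


def reassemble(segments: List[Tuple[int, bytes]],
               expected_count: int) -> Tuple[bytes, List[int]]:
    """Reorder received segments and report which ones must be resent.

    segments: (sequence_number, payload) pairs in arrival order.
    Returns the payloads joined in sequence order, plus the list of
    sequence numbers that never arrived.
    """
    received: Dict[int, bytes] = {}
    for seq, payload in segments:
        received.setdefault(seq, payload)  # keep the first copy, ignore duplicates
    missing = [seq for seq in range(expected_count) if seq not in received]
    data = b"".join(received[seq] for seq in range(expected_count)
                    if seq in received)
    return data, missing
```

Segments that arrive as 2, 0, 1 are delivered to the application in order 0, 1, 2, and a gap in the numbering identifies exactly which segment to request again. A UDP-style receiver would skip this bookkeeping, which is why it tolerates loss but offers lower latency.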

Vulnerability: It is in this layer and the network layer below that the distributed
denial of service (DDoS) attack is launched. With the ability to address IP packets fully
with TCP addresses and port numbers, the attacker can target a network component for
flooding with traffic from commandeered workstations called “robots.” The attacker
sets up a network of these robots, a “botnet,” that then causes an overflow of
requests for service through a legitimate network node.

Mitigation  Strategies:  This  layer  protects  information  mainly  through  the 
establishment  of  virtual  private  networks,  an  arrangement  of  nodes  in  a  physical  network 
authorized  to  communicate  and  pass  information  end-to-end.  The  VPN  is  established  by 
node  authentication  through  security  procedures  such  as  password  recognition  and 
firewall installations. Another technique for controlling the direction and path of
information through a multi-node network is to establish network overlays where only
certain nodes accept signals and packets of a certain type (for instance, differentiating
light wavelengths in optical fiber). In addition to the software tools that provide a Secure
Sockets Layer or SSL (a program to control a user’s ability to “plug in” to a network) for
guarding port numbers so that an engineered TCP address cannot be constructed, an
effective mitigation for DDoS is adaptable router reconfiguration, aided by
behavior and pattern recognition software in an intrusion detection device that senses when
unauthorized service requests are incoming and blocks them. Another tool that can be
used to control network participation is addressing packets with IP Security (IPsec).
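
As a rough illustration of the pattern recognition idea, the sketch below flags a source whose request rate exceeds a threshold within a sliding time window. It is an assumption-laden simplification: the class name, the threshold, and the window length are hypothetical, and because a distributed attack spreads traffic across many sources, per-source limits of this kind are only one input to a real intrusion detection device.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class RequestRateMonitor:
    """Allows at most max_requests per source within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._history: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, source: str, timestamp: float) -> bool:
        """Return True if the request is accepted, False if it should be blocked."""
        history = self._history[source]
        # Discard accepted requests that have aged out of the window.
        while history and timestamp - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_requests:
            return False          # over the limit: block, do not record
        history.append(timestamp)
        return True
```

With an assumed limit of three requests per second, a fourth request from the same source inside the window is blocked, while requests from other sources, or from the same source after the window has moved on, are accepted.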

5. Session Layer

Description: The session layer handles requests and responses between
applications or hosts. The software controlling actions in the session layer handles the
setup and management of sessions. In this layer, authentication and permissions are
controlled. Session management software tries to reestablish the original connection path
through the network when a connection is dropped (Smith, 2003).

Vulnerability: If the session management program does not verify the
authorization of a user to set up a connection with an application, database, or other user
(say, for collaboration), information could be stolen or altered, affecting information
integrity and confidentiality. In addition, an unauthorized member in a network could
repudiate the information sent by a broadcaster, thus delaying successive
information by forcing the continual resending of information that was received
the first time.

Mitigation  Strategies:  It  is  critical  in  this  layer  to  gain  authentication  of  those 
users  accessing  a  session  with  an  application  or  service  so  that  information  remains 
confidential.  Security  software  programs  that  provide  authentication  and  access  control 
are  important  to  keep  the  session  from  being  disconnected  or  from  information  being 
hijacked  by  an  unauthorized  member  of  the  session.  The  session  and  transport  layers  are 
where  software  called  middleware  resides  in  a  network  system.  Programs  have  been  built 
to  control  who  is  in  a  session,  who  can  send  and  receive  information,  and  how 
information  is  broadcast  through  the  virtual  network  created  by  the  session.  Transport 
and  session  are  the  ISO  layers  that  are  aggregated  in  the  FORCENet  model  above  as  the 
enterprise  services  layer.  Network  management  programs  designed  in  the  Common 
Object  Request  Broker  Architecture  (CORBA)  are  resident  in  the  session  layer  and 
determine  who  is  able  to  retrieve  what  objects  (complete  programs  or  sets  of  data). 
Middleboxes are the components that make up the middleware in the session layer.
They use programs such as the Network Address Translator (NAT), load balancers
(which rewrite packet headers) (Joseph & Stoica, 2008), and intrusion-prevention devices
as a type of firewall at the interface between applications and the network (DISA, 2009;
NSA/CSS, 2009). Service Oriented Architecture often uses a Distributed Data Services
architecture  in  developing  programs  which  are  based  on  a  data-centric  model  to  establish 
a  loose  coupling  at  the  middlebox  interface  with  the  network  versus  an  object  model  that 
tightly  couples  the  source  node  with  the  network  management  structure  and  the  network 
operating  system  (Joseph  &  Stoica,  2008). 

6. Presentation Layer

Description: In this layer, data packets are assembled into a language recognized by
the program using the data it has retrieved from the network, and data sent
from a program is packaged into packets for delivery through the network. Encryption of
data is usually performed in this layer, but can be deferred to subsequent layers before
transmission across the transparent network (Internet). Extensible Markup Language
(XML) is a method for packaging data sent from one program to another with a different
language (NSA/CSS, 2005; NSA/CSS, 2008). Extensibility is the quality that allows
add-ons that let the packaging evolve with changes to network
configurations, network control procedures, and the development of new applications
(Smith, 2003).

Vulnerability: Data integrity and confidentiality are at risk in the presentation layer
since access into this layer puts a malicious actor in contact with data or programs before
encryption. On the other side of the interface to the session layer, if data has not been
encrypted and marked up in accordance with an organization’s security policy (security
tokens, digital certificates, and user ID/password pairs), the data is available for pilfering
and/or corrupting.

Mitigation Strategies: To provide security to the markup, XML has variants that
attach onto the XML standard data representation. Used for web services security, Simple
Object Access Protocol (SOAP) supports digital signatures and encryption and forms the
shell to carry security tokens, Security Assertion Markup Language (SAML) assertions, and
other security headers attached to the data. Extensible Access Control Markup Language
(XACML) allows the inclusion and encapsulation of security measures on data packets
within the construct of a security management and services architecture. XACML
executes organizational policy models on information packets, from the Attribute-based
Access Control model (ABAC) and Role-based Access Control model (RBAC) to the
Identification-based Access Control model (IBAC), Authentication-based Access
Control model (NBAC), and Authorization-based Access Control model (ZBAC)
(NSA/IAD SOAP, 2008). The last of these is compatible with a data-centric security
model where the data and the network owner are kept separate, making this type of XML
packaging security system compatible with the goals of Service Oriented Architecture
and the capability to share information securely across very divergent platforms. Further
protection of data can be done in this layer (before markup in XML for transmission over
the network) using “hash tables” (Walker, 2008).
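
The difference between a role-based and an attribute-based decision can be illustrated with a minimal sketch. The code below is not XACML and does not reproduce any policy from the cited sources; it is plain Python, and the role table, the attribute names (clearance, classification), and the rule that combines the two checks are assumptions made purely for illustration.

```python
# Hypothetical role table for an RBAC check: role -> actions it may perform.
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "operator": {"read", "write"},
}


def rbac_permits(roles, action):
    """RBAC: permit if any of the requester's roles grants the action."""
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)


def abac_permits(user_attrs, data_attrs):
    """ABAC: compare requester attributes with labels carried by the data."""
    clearance = user_attrs.get("clearance")
    classification = data_attrs.get("classification")
    if clearance is None or classification is None:
        return False  # fail closed when a label is missing (assumed policy)
    return clearance >= classification


def decide(roles, user_attrs, data_attrs, action):
    """Assumed combination rule: both checks must pass to permit access."""
    ok = rbac_permits(roles, action) and abac_permits(user_attrs, data_attrs)
    return "Permit" if ok else "Deny"
```

In this sketch the attribute check depends only on labels that travel with the data, which is the property that makes an attribute-style decision fit a data-centric security model.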

7.  Applications  Layer 

Description: The top layer of the ISO model of abstraction is the applications
layer. This is where the services are performed. Many standard protocols are used in the
applications layer to allow access to various services. One of the more common protocols
that allow applications to access specific services is HTTP (NSA/SNAC, 2001). In the
applications layer, the network has finally reached the point where processing power is
accessed, data is stored, and real-world input from things like sensors or the human
interface is translated into digital format in the language used by that processor (Smith,
2003).

Vulnerability: The applications layer is where received data is processed to be
delivered to the physical interface (e.g., display) and where outgoing data from
storage or a physical interface (e.g., keyboard) is manipulated for subsequent
transmission through the network. Since data and programs are unencrypted for
use by the end user at this point, they are highly susceptible to interception,
theft, or fabrication. If a malicious actor can reach this point, the integrity and
confidentiality of information are easily compromised, either by the attacker retrieving
information or by the attacker modifying incoming or outgoing information.

Mitigation Strategies: In a collaborative session, loss of confidentiality of
information might be mitigated if the end user has a way to know who is attending the
session. Even though each attendee may be authorized, a count of access points might
give the end user an indication that the session should be terminated or the connection to
the network severed. Software programs are also available to monitor at the applications
level for members on a network who might be masquerading as an authorized user. For
survivability, data at the applications layer should be easily replicated in a secure location
and program functionality distributed among several authorized nodes of a network to
minimize the possibility of one failed node stopping the operation in progress. Equally
important to survivability is diversity in processing, such that if one program is
compromised, the end user could switch to a different mode of an application which may
not be compromised and continue operations while simultaneously discovering the
intrusion point and blocking it.
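
The access-point count described above can be sketched as a simple roster check. The following Python is only an illustration: the function name, the (user, access point) record format, and the default limit of one access point per attendee are hypothetical and are not taken from any tool cited in this appendix.

```python
from collections import Counter


def review_session(authorized_users, connections, max_points_per_user=1):
    """Flag a collaborative session whose attendance looks suspicious.

    connections is a list of (user, access_point) pairs observed in the
    session. The one-access-point-per-user limit is an assumed policy.
    """
    unique = set(connections)
    unknown = sorted({user for user, _ in unique if user not in authorized_users})
    points = Counter(user for user, _ in unique)
    multi = sorted(
        user
        for user, count in points.items()
        if user in authorized_users and count > max_points_per_user
    )
    return {
        "unknown_users": unknown,
        "multi_point_users": multi,
        "terminate": bool(unknown or multi),
    }
```

An attendee who is authorized but appears from two access points is flagged in the same way as an unknown attendee, which mirrors the point that authorization alone does not rule out a masquerade.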

C.  SUMMARY 

Risk analysis and the management of identified risks require a good understanding
of the location of vulnerabilities in the network. To locate possible intruder entrances
and paths through the network precisely, a thorough understanding of the layer
abstractions assists in defining the vulnerabilities and in mapping out a mitigation
strategy that can place the right resources in the right layer abstraction for maximum
effectiveness. When considering the vulnerability in a network, it is useful to know
which architectural abstraction layer or layers must be penetrated to compromise the
network’s availability or the data’s integrity or confidentiality, to know what the
characterization of the penetration might be depending on where the attack is made, and
finally to know what resources are available to protect the Quality of Service in each layer.
Knowing where in the layer abstraction the network is most vulnerable, given that the value
of the data asset remains constant, can support the decision of where to concentrate
resources to counter an attack. Equally important in computer network warfare, it
determines the ideal location to enter the enemy’s network to disrupt their operations if
that is a campaign objective and a desired effect.

APPENDIX B. OVERVIEW OF SOME CURRENT RISK
MANAGEMENT PROCESS MODELS AND THEIR
APPLICABILITY TO NETWORKS


A. INTRODUCTION

This appendix explores the methods of risk management processes for network
systems used to varying degrees in the commercial and defense markets. As DoD
continues to formalize its process of managing the risk to the global network system in
the GIG and the individual networks now resident in all military organizations,
incorporating applicable processes that have made these models successful for other
organizations could support DoD’s efforts to make its process robust and timely.

A majority of the models explored below are process models and vary between
the qualitative and quantitative realms. Qualitative models take a large amount of time
and resources, especially human resources, to collect and analyze all the data. They can
lack specificity and contain much subjective information and opinions. However, when
empirical data is unavailable or too difficult to obtain, as in the destructive testing of rare
and limited components, a subjective model may be the only path to obtain information
about a system and its risk factors. Quantitative models appear on the surface to be more
logical and fit into the engineer’s idea of accurate information and outcomes. However,
they too can produce misleading results if the input data into the model is ambiguous,
approximate, or inconsistent. Ambiguous data can come from the misidentification of
possible outcomes or not fully recognizing all possible outcomes from the
implementation of a process model. Approximations are introduced into models by
oversimplification of data or model parameters, assumptions introduced, or idealized
representations (Ayyab, 2003). In a system as broad in scope and complex as the global
connectivity of the GIG (let alone the individual services’ networked systems), no one
risk assessment process model provides the military a complete representation of the
steps to go through to quantify system risk. Nor is it possible for one model to provide the
information necessary to implement procedures or protection systems and schemes to
protect the entire network. That does not mean that the use of risk assessment models in
determining how to protect the GIG and all the systems in a global network can be
abandoned. Each model allows some insight into how to structure a risk management
system, be it for the enterprise, for a specific theater of operation, a component’s
partitioned operation in that theater, a weapons platform subsystem, or for a specific
application on that platform.

B. DESCRIPTIONS OF CURRENT MODELS AND THEIR
APPLICABILITY

1. Information Assurance Risk Management (IARM) (Safety Risk)

The IARM methodology was developed in 2001 by LCDR E. D. Hernandez, a
Naval Postgraduate School student, for his thesis work (Hernandez, 2001). The
methodology draws on the steps used by the Navy for Operational Risk Management
(ORM) as defined in OPNAV 3500.39B, a framework for evaluating the risks involved
in critical operations that pose a safety hazard (e.g., replenishment operations in heavy
seas). ORM is a systematic way of discovering and ranking the risks of an operation and
deciding how to minimize the risk by altering the way the operation is conducted, by
taking other safety measures, by avoiding risk by not performing the operation, or by
accepting the risk because the benefits outweigh the risk. IARM is also patterned after the
Navy-Marine Corps Intranet (NMCI) computer network defense model of protect, detect,
react, recover and revise and the Defense Acquisition University (DAU) five-category risk
evaluation matrix card.

The five steps in IARM are the following.

1. Identify vulnerabilities: vulnerabilities are classified according to asset
relationship (hardware, software, data, services) and information attributes
that could be compromised (confidentiality, availability, integrity).

2. Assess vulnerabilities: a process similar to the DAU risk card is used to
rank each vulnerability identified in step 1 as to its likelihood and its
severity. Depending on where the vulnerability falls within the matrix, a
unitless number is assigned as a risk level from 1 (most impactful) to 5
(least impactful).

3. Make risk decisions: based on the risk level in the second step, risk
assessments  are  made  to  determine  what  would  be  an  acceptable  risk 
based  on  the  benefit,  and  what  risks  need  to  be  mitigated  or  avoided.  This 
is  the  planning  stage  for  determining  the  controls  necessary  to  manage  the 
risks  identified  in  step  2. 

4.  Implement  Controls:  Installing  the  controls,  either  technological  or 
procedural,  assigning  responsibility,  and  providing  support. 

5.  Supervise:  This  stage  is  the  feedback  loop  of  monitoring  the  results  of 
control  implementation.  It  is  also  the  time  to  observe  if  the  implementation 
of  controls  has  had  any  adverse  effect  on  the  assumptions  made  in 
previous  steps. 
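
Step 2 of the list above, placing each vulnerability in a likelihood and severity matrix and reading off a risk level from 1 to 5, can be sketched as follows. The actual cell assignments of the DAU risk card are not given in this appendix, so the 1 to 5 input scales and the score thresholds used here are assumptions chosen only to show the mechanics; the function names are likewise hypothetical.

```python
def risk_level(likelihood, severity):
    """Map a (likelihood, severity) cell to a risk level, 1 = most impactful.

    Both inputs use an assumed 1-5 scale where 5 is the most likely or most
    severe. The thresholds below are illustrative, not the DAU card values.
    """
    if not (1 <= likelihood <= 5 and 1 <= severity <= 5):
        raise ValueError("likelihood and severity must be between 1 and 5")
    score = likelihood * severity  # ranges from 1 to 25
    for threshold, level in ((20, 1), (12, 2), (6, 3), (3, 4)):
        if score >= threshold:
            return level
    return 5


def rank_vulnerabilities(vulnerabilities):
    """Order (name, likelihood, severity) entries from most to least impactful."""
    rated = [
        (risk_level(likelihood, severity), -(likelihood * severity), name)
        for name, likelihood, severity in vulnerabilities
    ]
    return [(name, level) for level, _, name in sorted(rated)]
```

The ranked list is the input to step 3, where each risk level is compared with the benefit of the operation to decide whether the risk is accepted, mitigated, or avoided.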

The author cites as advantages of IARM over traditional approaches the
systematic nature of the method, the proactive nature of attempting to discover all threats
and vulnerabilities (such as social engineering threats), and increased communication
between network users, information systems technicians, and decision makers through a
common language of the ranking of risks (Hernandez, 2001). While the claim of a
systematic approach is justified, the advantage of capturing all threats and vulnerabilities
may be inflated. The process does look at more than just technological controls and
digital network threats. However, it is impossible to capture all threats and vulnerabilities
in one or even many iterations of a standard process. The system requirements and
system capabilities change too quickly, and the source and nature of threats
evolve continuously with developing technologies.

2. Central Computer and Telecommunications Agency (CCTA) Risk
Analysis  and  Management  Method  (CRAMM) 

CRAMM  is  a  process  model  owned  by  the  Government  of  the  United  Kingdom 
that  follows  a  three  step  process.  It  relies  heavily  on  qualitative  data  gathering  and 
analysis  from  subject  matter  experts  and  interviews  with  computer  system  operators.  The 
steps  in  the  process  are  the  following. 

1. Building an asset model and defining system boundaries.

2. Asset dependencies are established and a threat and vulnerability
assessment is made by circulating questionnaires. With the software tool,
measures of risk are calculated.

3. With a comprehensive countermeasure database, the tool recommends the
countermeasures (policies and technical tools) that should be implemented
to mitigate the risk.

Data gathered is placed into a software program that can produce reports to
management on asset classes, threats and vulnerabilities, and countermeasures to deploy.
The model has been around since 1985 and is difficult to use unless the organization has
someone who has used the model in the past. CRAMM does allow the organization to
systematically think about what the value of its information is and what steps need to
be taken to address vulnerabilities against known threats. Not much information exists on
what to do about unknown threats and how to address them as they are identified through
threat libraries or if one penetrates the organization’s system (Jones & Ashenden, 2005).

3. Fundamental Information Risk Management (FIRM)

FIRM is a two-phase, ten-stage process that looks at the information systems and
networks of an organization from a very high level. It is a quantitative data gathering and
scoring process with some software tools for organizing data gained through
“scorecards,” which are filled out by resource owners. The idea is to take the scorecards
and balance out the system functionality with system protections and produce a senior
management report as to the state of vulnerability and protection of the organization’s
system enterprise. The two phases are designed to get senior-level buy-in by showing
ways to plug obvious but overlooked vulnerability holes. The second phase collects
information about the enterprise a second time with a more detailed look at the score, or
assets versus threats and vulnerabilities. By virtue of distributing quantitative data
gathering among all the organization’s resource managers, this process appears to work
better in larger organizations. It does not take the next step of identifying and
recommending countermeasure implementation (Jones & Ashenden, 2005).

4.  Simple  to  Apply  Risk  Analysis  (SARA)  and  Simplified  Process  for 
Risk  Identification  (SPRINT) 

The SARA and SPRINT tools are complementary and the process is similar to the
processes above in that they rely on collecting data through interviews with management,
system operators, and subject matter experts. SPRINT is a fast-track tool that assesses the
business risk, the threats, and system vulnerabilities and controls through mediated
interviews. The goal after assessment is to produce a plan to implement system controls
that more effectively reduce the assessed risk. SARA is designed for business-critical
systems and uses a more in-depth approach than SPRINT. SARA uses interviews and
workshops to 1) define the system and its boundaries, 2) identify business
requirements for security, 3) assess threats and vulnerabilities in a workshop format,
and 4) produce an action plan. These tools are labor intensive and require the time
and manpower of a significant part of the organization (Jones & Ashenden, 2005).

5.  Cobra 

Cobra is a process similar to CRAMM and is owned by a security service in the
United Kingdom as well. While the process is similar to CRAMM, it makes extensive use
of questionnaires to gather data and the software tool is modularized to conduct
assessments on certain aspects of security in isolation. For a full assessment, the modules
are combined for a full report on the risk condition of the enterprise and the
countermeasures to implement (Jones & Ashenden, 2005).

6.  The  CORAS  Method 

Another United Kingdom funded project, CORAS (an undefined acronym) is a
risk management method that uses the Unified Modeling Language (UML) in its automated
tool, which supports a methodology similar to the process models above. The method
follows the process of 1) identifying security-critical assets through a questionnaire and
accurately representing the current system’s security state and the interaction between
system components, 2) risk identification by determining threats (through fault trees and
Failure Mode, Effects and Criticality Analysis [FMECA]), 3) risk analysis through
identification of outcomes and consequences and the likelihood of the outcomes, 4) risk
evaluation by ranking the risks according to likelihood and evaluating the impact of the
consequences determined in step 3, and 5) risk treatment in the form of mitigation or
avoidance/transference strategies by employing countermeasures to reduce the likelihood
or reduce the consequences. Throughout the process steps, the model specifies strong
communication between teams conducting the analysis and constant monitoring and
review of results so that the model can be adjusted in an iterative fashion (Hamdi &
Boudriga, 2005).

The next three process models were developed by the CERT® Coordination
Center, which is part of the Software Engineering Institute, Carnegie Mellon University,
a federally funded research and development center sponsored by the U.S. DoD. The primary
objective of CERT is to develop technology and systems management practices to resist
attacks on computer network systems and to limit damage and ensure continuity of critical
services when an attack on a network occurs.

7.  Operationally  Critical  Threat,  Asset,  and  Vulnerability  Evaluation 
(OCTAVE) 

This process model and its associated software were developed by Carnegie
Mellon University’s Software Engineering Institute and use the workshop format to
define the assessment and gather data. The process is designed to bring all facets of a
business together through interdisciplinary teams that look at risks from threats and
vulnerabilities to the organization not only in information systems, but in all facets of a
business. The process model is run in three phases: 1) asset prioritization through data
gathering, 2) assessment of threats and vulnerabilities and where these impact
information flow and veracity through workshops, and 3) risk identification and
development of mitigation strategies. One of the terminal objectives of the OCTAVE
process is to develop threat profiles for individual assets. This takes into account both
threat access ability (through vulnerabilities and other means) and a threat motivation
profile. This method is unique to OCTAVE among the previous models, but is a
theme used in some of the later models examined.

The CERT Coordination Center’s OCTAVE
model follows many of the characteristics of
the DoD Risk Management Process for
Acquisition Programs

Figure 19. The OCTAVE Risk Management Cycle (From: Caralli & Young, 2008)

New versions of the OCTAVE method and tool have been developed since
OCTAVE was created in 1999, including a small business version and a “continuous
improvement” version. The same methodology of interactive team workshops is
employed by the newest version, but it concentrates on information asset storage,
transmission, and use to allow assessments to be conducted without professional
knowledge  of  assessment  processes.  The  aim  is  to  broaden  the  landscape  assessed  by 
opening  up  the  assessment  process  to  operational  security,  not  just  information  security. 
The  shift  from  periodic  risk  assessment  to  a  continuous  action  was  prompted  by  the 
dramatic  changes  in  information  ownership  and  use  through  networked  systems  just  in 
the  last  decade.  New  terms  have  been  developed  including  Continuous  Risk  Management 
(CRM)  and  Operational  Risk  Management  (ORM),  although  this  is  not  a  new  term  to 
Naval  safety  programs  (Caralli  &  Young,  2008).  The  OCTAVE  developers  have 
expressed  the  need  for  a  continuous  program  in  the  following  statement. 

Operational resiliency in an organization is dependent on many types of
organizational assets performing together to achieve a mission. (Caralli &
Young, 2008, p. 78)

8. Architecture Refinement Process


This model, created by Robert Ellison and Andrew Moore (Ellison & Moore,
2001) from the Software Engineering Institute of Carnegie Mellon University
(SEI/CMU) CERT Coordination Center, is tailored more toward the concept definition,
design,  and  development  stage  of  a  system  lifecycle  and  concentrates  on  architecting  a 
system  for  survivability.  Consistent  with  other  models  in  the  architecting  of  software- 
based  systems,  they  employ  the  spiral  model  from  a  systems  engineering  standpoint.  In 
fact,  the  process  is  the  successive  use  of  four  models/processes,  which  comprise  the  four 
quadrants  of  a  spiral  graph.  Counter  to  some  developers  concentrating  their  architecture 
on  the  arrangement  of  technological  security  components,  this  model  seeks  to  address 
how  the  system  architecture  counters  attacks  that  degrade  the  system’s  mission;  in  other 
words,  how  to  architect  for  survivability.  The  authors  define  survivability  similar  to  the 
definitions  of  Chapter  V  in  that  survivability  is  the  characteristic  of  a  system  to  perform 
its  designated  mission  even  when  penetrated  and  compromised  by  a  hostile  force.  The 
survivability  of  a  system  is  impacted  by  the  system’s  reliability,  performance,  safety, 
security  and  fault  tolerance. 

Similar to the military’s Observe, Orient, Decide, Act (OODA Loop) and Plan,
Decide, Execute (PDE Cycle) processes, the spiral process of this model leads the developer
through architecting the defense of a system in a Resist, Recognize, Recover continuum,
which employs reusable survivability design primitives. Examples of these design
primitives are replication, redundancy, distribution, separation, access control, intrusion
detection, diversity, and adaptive reconfiguration. The reuse of these primitives comes in
an iterative fashion as the architect works through each cycle of the spiral. The four
quadrants of the spiral are the following.

1. Survivability Planning; including mechanism-based risk remediation

2. Usage Modeling; essential work-flow analysis

3. Intrusion Modeling; using attack trees and intrusion work-flow models.
(An intrusion work-flow model is simply a path through an attack tree
diagram to show what path an intruder might take to exploit vulnerabilities
in the networked system.)

4.  Survivability  Risk  Analysis;  vulnerability  and  impact  assessment 

The process starts with designing the system architecture around survivability
design primitives based on the current knowledge of the system requirements and the
adversarial environment. The usage modeling refines the architecture around essential
work flows. A model of intrusion to the architected network is built with an attack tree
analysis of possible intrusions and their cascading effect on the architected network.
Finally, a survivability risk analysis is conducted based on the intrusion model to
determine where the architecture needs to be refined for survivability, allowing essential
services to recover or continue operating to meet the system’s mission. The process of
this model is shown in three iterations: first, network-based attacks are considered;
second, application-based attacks are contemplated; and finally, data-centered attacks
are analyzed. Each iteration considers the correct architecting of the system for
survivability using the survivability design primitives that preserve the system’s quality
attributes of performance and reliability (Ellison & Moore, 2001).
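
The idea of an intrusion work-flow as a path through an attack tree can be made concrete with a short sketch. The tree encoding used here (a leaf is a string; an inner node is a tuple of "AND" or "OR" and a list of children) and the example attack steps are assumptions for illustration and are not drawn from Ellison and Moore’s tooling.

```python
from itertools import product


def scenarios(node):
    """List every intrusion scenario (a list of leaf steps) that reaches node.

    A leaf is a string naming one attacker action. An inner node is a tuple
    (kind, children) where kind is "OR" (any child suffices) or "AND"
    (every child is required).
    """
    if isinstance(node, str):
        return [[node]]
    kind, children = node
    child_scenarios = [scenarios(child) for child in children]
    if kind == "OR":
        # Each alternative child contributes its own scenarios.
        return [steps for group in child_scenarios for steps in group]
    if kind == "AND":
        # One scenario from every child must be combined.
        return [sum(combo, []) for combo in product(*child_scenarios)]
    raise ValueError(f"unknown node kind: {kind!r}")


# Hypothetical attack tree for the goal "read protected data".
EXAMPLE_TREE = (
    "OR",
    [
        ("AND", ["obtain credentials", "log in remotely"]),
        "exploit unpatched service",
    ],
)
```

Each list returned by scenarios(EXAMPLE_TREE) is one candidate intrusion work-flow; the survivability risk analysis would then ask which essential services each work-flow could degrade.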

The penultimate survivability primitive above, diversity, is a topic of some controversy.
In their article about the unwarranted concern of not diversifying, Fred Schneider and Ken
Birman from Cornell University postulate that diversity, especially in software, can be
too expensive and can make a system too complex for the benefit it provides, since a
network attacker gains only slightly from the similarity of software systems in a
monoculture environment (Schneider & Birman, 2009).

9. Mission Assurance Analysis Protocol (MAAP)

Another process model developed by SEI/CMU is the Mission Assurance
Analysis Protocol or MAAP, a process contained in the SEI Mission-Oriented Success
Analysis and Improvement Criteria (MOSAIC) management approach. MAAP is more of
a general method of looking at distributed, complex systems within an organization to
discover the elements that make them successful and to mitigate the factors that deter
success. The process consists of building a model (representation) of the current
state of a system in terms of its ability to achieve mission success. The uncertainty
(probability) of achieving objectives due to inside or outside influences is analyzed along
with the categorization of the threats (reliability issues, unintended mistakes, intrusions)
that would hinder success. The MAAP protocol is a roadmap for conducting the analysis
by assigning activities, goals, and expected outcomes. Risk plays into this model in the
consideration of the uncertainty of factors that would inhibit success. As with the process
models explained above, this process is time and manpower intensive, with the formation
and training of teams from within an organization to conduct the process. The
requirements for being on a team are an in-depth knowledge of the system being
considered and an understanding of risk assessment, process modeling, and statistics. The
qualifications seem to indicate that a team with all members of this caliber would be
difficult to assemble for just one part of a distributed system, let alone the entire network.

The end product of a MAAP assessment is a success profile for every key
objective determined during the operational model development. Each key objective
success profile informs the organization of its probability of mission success in that key
objective, and for key objectives not acceptable to the organization, a plan can be
developed to remediate the influences causing a less than desired success rate.
Interestingly, the operational model development is different from the asset valuation of
the models above because it focuses not so much on value but more on the contribution each
element in the distributed system makes to overall mission success. The decomposition of
each key objective into influences and their uncertainties requires a disciplined approach
to determining threats and vulnerabilities (Alberts et al., 2008).

MAAP  is  primarily  a  management  process,  designed  to  take  a  holistic  view  of  a 
complex  and  distributed  system.  While  the  description  of  the  MAAP  process  does  not 
explicitly  state  that  it  is  a  risk  management  model  for  computer  networks,  the  fact  that  it 
was  developed  by  SEI/CMU  seems  to  indicate  that  the  motivation  was  to  use  this  process 
in  a  software-driven  computer  network  environment.  In  fact,  SEI/CMU  has  piloted  the 
protocol  in  a  cyber-security  incident  response  system  and  in  elements  of  software 
development  and  deployment. 

10.  Network  Risk  Analysis  Method  (NetRAM) 

This framework for risk management was developed by M. Hamdi and N.
Boudriga of the University of Tunis (Hamdi & Boudriga, 2005). Their motivation in
developing this method was to develop a structured framework that captures risk
management approaches, techniques, and software tools that can be used by an
organization to run an effective risk management program. Their methodology centers on
risk management analysis, decision, and response. The model consists of a ten-step
recursive process that includes pathways to return and adjust previous steps based on
information revealed during a later step. Included in one of their model steps is a robust
process for incident impact and response, particularly framing the response to maximize
recovery and minimize reduction in critical services and safety issues. The model is
scalable to different types and sizes of network architectures and topologies, and is
designed to remain current with changing technologies with a learning process that
updates quantitative parameters and semantic links. A set of modules in the software tool
also monitors the system’s states to detect deviations from normal and differences in
values of key parameters. One of the unique features claimed by the authors is that their
methodology restricts the propagation of errors or poor decisions made in previous steps
of the methodology.
[Figure 20 diagram not reproduced; recoverable module labels: Implementation, Monitoring.]

The NetRAM model includes a process for evaluating risk based on the network’s ability to react to attacks that are experienced, a measure of resiliency.

Figure 20. The 10 Modules of the NetRAM Framework (From: Hamdi & Bordiga, 2005)
Although this model looks quite similar to the ones above and uses some of the same methodologies for determining threats and assessing vulnerabilities (a combination of questionnaires and automated vulnerability scanners), its separation of preventative and reactive risk analysis stands out. Preventative analysis studies the likelihood of threats exploiting vulnerabilities with a resultant undesired effect. Reactive analysis considers the probability that the detection system alerts on a network intrusion, the extent of the penetration that follows, and the effects on system operation, particularly critical operations. This is the idea behind a consequences-based risk management approach. While full efforts should still be directed at prevention, it is inadvisable to give short shrift to planning the reaction to and recovery from an unknown threat that exploits a vulnerability left undetected and unprotected. In addition, an important element of reactive risk management is real-time recovery: an analysis of the time the system requires to recover, so as to limit the loss of system functionality or to meet critical mission objectives.

11.  Mission  Oriented  Risk  and  Design  Analysis  (MORDA) 

The  MORDA  methodology  was  developed  to  address  the  risk  involved  in  the 
operation  of  DoD’s  Global  Command  and  Control  System  (GCCS)  from  the  war  fighter’s 
perspective.  The  methodology  employs  a  model  called  Security  Optimization 
Countermeasure  Risk  and  Threat  Evaluation  System  (SOCRATES).  The  MORDA 
methodology and the SOCRATES model rest on the following foundations:

•  Attack  tree  models 

•  Qualitative  information  assurance  models 

•  Quantitative  information  assurance  models 

•  Multiple objective decision analysis models for information assurance, using values rather than alternatives to measure parameters

It is built within the DoD’s Information Assurance Technical Framework (IATF), the architecture the DoD has developed for information assurance systems implemented on a DoD network. The word design was included in the model’s name because the developers held that risk assessment and risk management need to start in the design phase of a system, not when the system reaches the operational phase of its lifecycle.

SOCRATES is a quantitative design optimization model that uses multiple objective decision analysis as a mathematical technique for optimizing the countermeasure design of a system. Input into the model is data about the adversary provided by threat experts, the attack profile provided by security experts using attack tree analysis, and the countermeasure characteristics and design options provided by systems engineers. Input data is entered into three value models: adversary model, user model, and service provider model. Using multiple objective decision analysis, value goals are matched to value measures (constructive units are developed if no natural units exist), and the value measures are weighted. The competing objectives, now value measured and weighted, are compared by a weighted sum such that:

$$v(x) = \sum_{i=1}^{n} w_i \, v_i(x_i) \qquad (1)$$

where:

$v(x)$ = value of the alternative

$i = 1$ to $n$ = numbered value measures

$x_i$ = score of the alternative on the $i$th value measure

$v_i(x_i)$ = a single-dimensional value of a score of $x_i$

$w_i$ = weight of the $i$th value measure (all weights sum to 1) (Buckshaw et al., 2005, p. 23).


This yields the optimal alternative within a set of competing objectives. To explain the value system, an adversary value model would use attack data based on the adversary’s motivation for the attack, the adversary’s assumption as to how likely the attack would succeed, how likely the adversary or the attack would be detected, the adversary’s resource consumption in executing the attack, and the impact on the system should the attack succeed. Different techniques, including affinity diagrams, are employed to produce the initial value goals and measures. Some assumptions of the models are that it is better to measure attack preferences instead of probabilities, that the adversary is a rational thinker, and that an adversary is going to try to maximize their benefit and the impact of their attack while minimizing resource expenditure.
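
The weighted sum used to compare alternatives can be illustrated with a short sketch. The value measures, single-dimensional value functions, weights, and attack alternatives below are hypothetical and chosen only for illustration; they are not taken from the MORDA or SOCRATES documentation. The sketch assumes each value function maps a raw score onto a 0 to 1 scale and that the weights sum to 1.

```python
from typing import Callable, Dict, Sequence

ValueFunction = Callable[[float], float]


def additive_value(scores: Sequence[float],
                   value_functions: Sequence[ValueFunction],
                   weights: Sequence[float]) -> float:
    """Weighted sum v(x) = sum_i w_i * v_i(x_i) for one alternative."""
    if not (len(scores) == len(value_functions) == len(weights)):
        raise ValueError("need one score, value function, and weight per measure")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * v(x) for w, v, x in zip(weights, value_functions, scores))


# Hypothetical adversary value measures (illustrative assumptions):
# likelihood of success, likelihood of detection, resource cost (0-100 scale).
VALUE_FUNCTIONS = [
    lambda p_success: p_success,       # more likely to succeed: valued more
    lambda p_detect: 1.0 - p_detect,   # more likely to be detected: valued less
    lambda cost: 1.0 - cost / 100.0,   # more costly to execute: valued less
]
WEIGHTS = [0.5, 0.3, 0.2]

# Hypothetical attack alternatives and their raw scores x_i.
ALTERNATIVES: Dict[str, Sequence[float]] = {
    "phishing": [0.6, 0.2, 10.0],
    "insider": [0.8, 0.5, 40.0],
}

values = {name: additive_value(x, VALUE_FUNCTIONS, WEIGHTS)
          for name, x in ALTERNATIVES.items()}
preferred = max(values, key=values.get)
print(values, preferred)
```

Under these assumed scores the adversary model prefers the alternative with the higher weighted value, which is the alternative a countermeasure design would most need to address.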

The  user  value  model  is  structured  similarly,  and  the  value  measure  is  defined  by 
the  values  of  achieving  an  objective  of  the  war  fighter.  The  competing  objectives  are  the 
limitations  on  the  system  using  alternative  countermeasures  and  the  effect  on  mission 
accomplishment. 

The  value  models’  results  are  input  into  an  Integration  and  Analysis  model,  which 
determines  the  value  of  an  operable  system  to  the  war  fighter  considering  the  cost/benefit 
and  constraints  on  the  use  of  the  best  value  countermeasure  alternatives  determined  in  the 
value  models.  The  integration  takes  the  countermeasure  alternative  that  best  counters  the 
adversary  and  compares  it  to  the  degradation  in  value  (mission  accomplishment)  to  the 
user community. This, along with the service provider value model (which is important to the service providers), yields Av(x), a comparison of alternatives that match adversary, user and service provider values to determine the best alternative. When processed
through  a  cost  benefit  model,  and  evaluated  as  to  the  resources  required  to  implement,  an 
architectural  framework  of  countermeasures  can  be  developed  in  the  design  of  a  new 
system  or  as  a  modification  to  an  existing  system  (for  instance  GCCS).  The  final  step  in 
the  methodology  is  to  optimize  the  system  value  as  constrained  by  system  cost  and 
countermeasure  compatibility. 
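
The final optimization step can be pictured with a small brute-force sketch. This is not the SOCRATES optimizer, whose internals are not described here. It assumes, for illustration only, that each countermeasure has a single net value and a cost, that values add, and that incompatibility is expressed as forbidden pairs. All names and numbers are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set, Tuple


def best_countermeasure_set(options: Dict[str, Tuple[float, float]],
                            budget: float,
                            incompatible: Set[FrozenSet[str]]
                            ) -> Tuple[Tuple[str, ...], float]:
    """Exhaustively search subsets for the highest total value within budget.

    options maps a countermeasure name to (net_value, cost).
    incompatible holds pairs of names that cannot be deployed together.
    """
    best_names: Tuple[str, ...] = ()
    best_value = 0.0
    names = sorted(options)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            cost = sum(options[n][1] for n in subset)
            if cost > budget:
                continue  # violates the cost constraint
            if any(frozenset(pair) in incompatible
                   for pair in combinations(subset, 2)):
                continue  # violates the compatibility constraint
            value = sum(options[n][0] for n in subset)
            if value > best_value:
                best_names, best_value = subset, value
    return best_names, best_value


# Hypothetical countermeasures: name -> (net value, cost).
OPTIONS = {
    "firewall": (0.30, 40.0),
    "ids": (0.25, 35.0),
    "encryption": (0.20, 30.0),
    "proxy": (0.15, 20.0),
}
INCOMPATIBLE = {frozenset({"ids", "proxy"})}

chosen, total_value = best_countermeasure_set(OPTIONS, 90.0, INCOMPATIBLE)
print(chosen, total_value)
```

Exhaustive search is workable only for a handful of options. It is used here because it makes the two constraints, cost and compatibility, explicit.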

The  authors  believe  there  are  advantages  to  their  model  and  methodology  over 
other  models  in  that  it  explicitly  describes  the  sequence  of  attacks  and  compares  the 
motivation  behind  certain  attack  strategies.  It  also  takes  into  consideration  the  conflicting 
objectives  and  the  interdependencies  of  all  the  variables,  and  sets  a  framework  for 
allocating  resources  best  aligned  for  mission  accomplishment. 
C. SUMMARY

A common thread in all the process models discussed above is the necessary involvement of the entire organization in the process, from top level management to the network administrators and functional managers. Executives need to take an on-going role in the security and risk management process of one of their most valuable assets. “It is now recognized that network security is a mainstream business process, which can only be mandated and directed by senior management” (Adler & Lepofsky, 2000, p. 38).

All of the models either allude to or directly state that risk management is an iterative and continuous process. A static plan, just like the technology of today, is outdated in short order. Unless the value of the information in the system, the changes in system vulnerabilities, and the threat to information system assets and the operations they support are continually reviewed for necessary changes to network technology and personnel policies and procedures, the value of the network rapidly declines as the system becomes increasingly susceptible to attack from competitors and adversaries.
APPENDIX C. FORMULATION OF THE COST/BENEFIT MODEL
FOR REACTIVE RISK ANALYSIS

A. MODEL DERIVATION

Hamdi and Bordiga developed a cost/benefit model that can be used to analyze a set of Intrusion Detection Systems (IDS) and reactive countermeasures to support decisions on the type and quality of IDS and countermeasure required to counter a given set of attacks on a network. In this model (Hamdi & Bordiga, 2005), the probability of attack is not considered, since it is an analysis of effects on the network when attacks of a given set are conducted against the network. The analysis provides a cost/benefit analysis of the set of IDS and countermeasures given the set of attacks and the effects/consequences of the impact on the network from the attacks. The analysis gives a relative measurement of the cost and benefit to network survivability and resiliency for a set of IDS and countermeasures installed on the network. The network architects can compare different sets of IDS and countermeasures that fit within the budget to maximize the survivability benefit to the network. The model uses four factors to analyze the cost and benefit of a given set of IDS and countermeasures. They are as follows.

Detection cost is the set of costs for every IDS/analyzer installed on the network, whether a particular IDS is used against an attack or not. (An IDS is composed of a network traffic sensor and an analyzer to interpret data from the sensor; in this analysis, the authors use the term “analyzer” to denote the entire IDS.) The cost is a given and is scaled appropriately for the units of the whole model. Detection cost is depicted as $\gamma_{A_i}$, with $i$ contained in $\{1, \ldots, n_A\}$ for analyzers.

Reaction cost is also given and scaled to the units of the analysis for each countermeasure that provides a reaction to an attack. It is depicted as $\gamma_{r_k}$ for each reaction $r_k$, with $k$ contained in $\{1, \ldots, n_r\}$ for countermeasures.

Impact of a given attack $a_j$ is the set of effects the attack causes on the network. The effects of the attack, which represent multiple attributes, are represented by the term $I_j$, and a unique term is developed called the progression factor. A progression factor $\lambda_{kj}$ is included to incorporate the cost or benefits realized by the type of progression that characterizes the attack: for instance, whether the attack is stopped before it completes execution and before the entirety of its impact is felt on the system or, if the attack is allowed to complete execution, whether the countermeasures can recover the network in time to continue required operation. The progression factor is determined by the effectiveness of the countermeasures to stop the attack and recover the network. Impact is modeled as a function of the progression factor, where $I_j(1) - I_j(\lambda_{kj})$ is the benefit to the network from the progression factor and $I_j(1)$ is the maximum damage from attack $a_j$. Thus, $I_j(\lambda)$ is the cumulative sum of elementary impacts of attack $a_j$ in the interval $[0, \lambda]$ such that:

$$I_j(\lambda_{kj}) = \int_{0}^{\lambda_{kj}} i_j(t) \, dt \qquad (1)$$

where $i_j(t)$ is defined as the impact function over time as the attack progresses. $i_j(t)$ can be any function given the type of attack: constant, $i_j(t) = i_0$ if $0 < t < \lambda$, or linear, or any function that defines the progression of the impact over time. In addition, the upper bound of the interval is the maximum tolerated time the network is allowed to be down and still be considered survivable.
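
A numerical sketch may help make the cumulative impact concrete. It assumes the reading that the impact accumulated up to a progression factor is the integral of an elementary impact function from zero to that factor, and that a progression factor of 1 corresponds to the complete attack. The impact functions and numbers are hypothetical.

```python
from typing import Callable


def cumulative_impact(impact_rate: Callable[[float], float],
                      progression: float,
                      steps: int = 1000) -> float:
    """Approximate the integral of impact_rate over [0, progression].

    Uses the composite trapezoid rule with `steps` equal subintervals.
    """
    if progression < 0:
        raise ValueError("progression factor must be non-negative")
    h = progression / steps
    total = 0.5 * (impact_rate(0.0) + impact_rate(progression))
    for n in range(1, steps):
        total += impact_rate(n * h)
    return total * h


def constant_rate(t: float) -> float:
    """Hypothetical constant elementary impact i(t) = i_0 = 5."""
    return 5.0


def linear_rate(t: float) -> float:
    """Hypothetical linearly growing elementary impact i(t) = 10 t."""
    return 10.0 * t


# Impact if a reaction stops the attack at progression factor 0.4.
stopped_constant = cumulative_impact(constant_rate, 0.4)
stopped_linear = cumulative_impact(linear_rate, 0.4)

# Benefit of the reaction: maximum damage minus damage actually incurred.
benefit_constant = cumulative_impact(constant_rate, 1.0) - stopped_constant
print(stopped_constant, stopped_linear, benefit_constant)
```

With a constant impact rate, stopping the attack early saves damage in proportion to the time saved. With a growing rate, most of the damage lies late in the attack, so an early reaction saves proportionally more.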

Intrusion Detection System (IDS) efficiency is formulated as follows. The authors define ‘D’, contained in the binary set {0, 1}, as the detection indicator: 0 if no attack is detected and 1 if an attack is detected. They define ‘A’, contained in the binary set {0, 1}, as the attack indicator: 0 if no attack takes place and 1 if the network is attacked. Efficiency of the IDS can be measured by the following conditional probabilities.
•  $P(D \mid A)$ represents a true positive, estimated by sending contaminated packets past the IDS

•  $P(D \mid \bar{A})$ represents a false positive

•  $P(\bar{D} \mid A)$ represents a false negative

•  $P(\bar{D} \mid \bar{A})$ represents a true negative: no detection, no attack (Hamdi & Bordiga, 2005)
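
As a hedged illustration, the four conditional probabilities can be estimated from the counts recorded in a controlled IDS test. The function name and the test counts below are hypothetical and do not come from the source model.

```python
from typing import Dict


def ids_efficiency(true_pos: int, false_pos: int,
                   false_neg: int, true_neg: int) -> Dict[str, float]:
    """Estimate IDS conditional probabilities from test counts.

    true_pos / false_neg: attack traffic that was / was not detected.
    false_pos / true_neg: benign traffic that was / was not flagged.
    """
    attacks = true_pos + false_neg
    benign = false_pos + true_neg
    if attacks == 0 or benign == 0:
        raise ValueError("need both attack and benign samples")
    return {
        "P(D|A)": true_pos / attacks,          # true positive rate
        "P(D|not A)": false_pos / benign,      # false positive rate
        "P(not D|A)": false_neg / attacks,     # false negative rate
        "P(not D|not A)": true_neg / benign,   # true negative rate
    }


# Hypothetical test: 100 contaminated and 100 clean packet streams.
rates = ids_efficiency(true_pos=90, false_pos=5, false_neg=10, true_neg=95)
print(rates)
```

The two rates conditioned on an attack sum to 1, as do the two conditioned on no attack, so only one rate from each pair needs to be measured.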

B. MODEL FORMULATION

When considering costs, the authors do not specify the units, but mention that the costs could be monetary or a measurement of resource consumption. The cost to the system of a reaction to a given attack is the probability of detecting a true attack times the sum of the cost of the IDS analyzer, the cost of the reaction, and the cost of the impact as modified by the benefits of the reaction. It also includes the sum of the cost of the IDS analyzer and the cost of the reaction times the probability that the system makes a false positive detection. Considering an analyzer, $A_i$, that alerted on attack $a_j$, the cost of performing response $r_k$ to stop the attack is such that:

$$\Gamma(r_k, a_j) = \left(\gamma_{A_i} + \gamma_{r_k} + I_j(\lambda_{kj})\right) P(D \mid A) + \left(\gamma_{A_i} + \gamma_{r_k}\right) P(D \mid \bar{A}) \qquad (2)$$

The benefit, $B(r_k, a_j)$, to the system of a given reaction to a given attack is the probability that the analyzer detects the attack times the benefit defined above (impact without progression factor minus impact with progression factor) such that:

$$B(r_k, a_j) = \left(I_j(1) - I_j(\lambda_{kj})\right) P(D \mid A) \qquad (3)$$

In other words, the reaction to the attack modifies the total attack impact from the start of the impact until the reaction has stopped the attack (Hamdi & Bordiga, 2005). This model takes into account some important elements of gauging the survivability of a networked system; however, it can be challenging to produce empirical quantities for the costs of IDS systems, the costs of reaction countermeasures, and a quantitative value for each element of an attack, $I_j$, and the progression factor, $\lambda_{kj}$, which mitigates the effect of the attack. The value that is most likely the easiest to obtain is a value for IDS efficiency as defined above. This value can probably be obtained through extensive testing against known attacks. However, even this value may not be independent of other components and their use in a network system.
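
The cost and benefit relations described in words above can be sketched as follows. The sketch assumes the reading that cost is the true positive probability times the sum of detection cost, reaction cost, and incurred impact, plus the false positive probability times the sum of detection cost and reaction cost. It assumes benefit is the true positive probability times the impact avoided. Field names and all numbers are hypothetical, and units are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReactionCase:
    """One analyzer/reaction/attack combination (illustrative fields)."""
    detection_cost: float    # cost of the IDS analyzer
    reaction_cost: float     # cost of the countermeasure reaction
    impact_max: float        # damage if the attack runs to completion
    impact_incurred: float   # damage actually incurred given the reaction
    p_true_positive: float   # probability of detection given an attack
    p_false_positive: float  # probability of detection given no attack


def expected_cost(case: ReactionCase) -> float:
    """Cost of reacting: true-alarm term plus false-alarm term."""
    fixed = case.detection_cost + case.reaction_cost
    return ((fixed + case.impact_incurred) * case.p_true_positive
            + fixed * case.p_false_positive)


def expected_benefit(case: ReactionCase) -> float:
    """Benefit of reacting: impact avoided, weighted by detection."""
    return (case.impact_max - case.impact_incurred) * case.p_true_positive


case = ReactionCase(detection_cost=2.0, reaction_cost=1.0,
                    impact_max=10.0, impact_incurred=4.0,
                    p_true_positive=0.9, p_false_positive=0.05)
cost = expected_cost(case)
benefit = expected_benefit(case)
print(cost, benefit)
```

Evaluating several candidate IDS and countermeasure combinations this way gives the relative comparison the model is intended to support. The absolute numbers carry meaning only if the input costs and impacts share a common scale.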

Even more challenging for the network developer, or the engineer who is designing network connectivity between existing information systems, is the balancing act between network costs and network capability to meet requirements. As mentioned in the main body of the thesis, it is difficult to define a set of survivability requirements because they are driven by the scenario (that is, they are attack-specific).

Perhaps the value in a network cost/benefit model such as this one is that it supports the development of survivability requirements. These requirements can then be added to the capability requirements definition as determined by the systems engineering process used, and they can shape the architecture that defines the network composition, both from a hardware standpoint and from the standpoint of the software components developed or applied for reuse.